首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >"DNA突变“博弈机制的Monte模拟

"DNA突变“博弈机制的Monte模拟
EN

Code Review用户
提问于 2020-04-06 21:43:21
回答 1查看 123关注 0票数 1

游戏空间站13号(位于天堂空间站代码库上)包含一个DNA机修工。每个带有DNA的碳基生命形态都有55个“块”,可以在DNA修饰剂中修改,一旦达到某一阈值,就会使暴民有残疾或超能力。

DNA片段是从000FFF的3位碱基-16个数字.对于每种突变,激活阈值是不同的(残疾和小功率具有较低的阈值),但是如果一个块被设置为十六进制DAC或更高的值,则该块上的所有突变都保证是活动的。

下面是DNA修改过程的一个例子:

  1. 一个生物进入DNA修饰符。
  2. 遗传学家看着这个生物的第一个DNA片段。它被设置为357
  3. 遗传学家选择第一个数字块,3,并照射它。
  4. 将数字3替换为数字9
  5. 957低于DAC的阈值。遗传学家再次照射第一个数字,直到它至少是D
  6. 如果第一个数字高于D,则进程结束。如果是在D,遗传学家会照射这个区块,直到它在A或更高的位置。然后,第三个街区,直到它在C或更高。
  7. 此过程将继续进行,直到达到DAC或更高的值为止。(如果第一次辐射导致了E57,那就早结束了。)

我想知道,平均而言,要对DAC照射一个街区需要多少次尝试。并不是每一个数字都能转换成其他的数字,而且概率表也不一致。幸运的是,我可以访问天堂站的源代码,因此我可以准确地模拟这个过程。

这个游戏是用一种相当笨重的语言写成的,叫做DreamMaker --我真的不知道如何使用它。负责辐照单个光的函数称为miniscramble,其源代码如下:(稍微修改一下以提高可读性,定义原来并不存在)

代码语言:javascript
复制
#define HIGH_SCRAMBLE prob((rs*10))
#define MED_SCRAMBLE  prob((rs*10)-(rd))
#define LOW_SCRAMBLE  prob((rs*5)+(rd)))
/proc/miniscramble(input,rs,rd)
    var/output
    output = null
    if(input == "C" || input == "D" || input == "E" || input == "F")
        output = pick(HIGH_SCRAMBLE;"4",HIGH_SCRAMBLE;"5",HIGH_SCRAMBLE;"6",HIGH_SCRAMBLE;"7",LOW_SCRAMBLE;"0",LOW_SCRAMBLE;"1",MED_SCRAMBLE;"2",MED_SCRAMBLE;"3")
    if(input == "8" || input == "9" || input == "A" || input == "B")
        output = pick(HIGH_SCRAMBLE;"4",HIGH_SCRAMBLE;"5",HIGH_SCRAMBLE;"A",HIGH_SCRAMBLE;"B",LOW_SCRAMBLE;"C",LOW_SCRAMBLE;"D",LOW_SCRAMBLE;"2",LOW_SCRAMBLE;"3")
    if(input == "4" || input == "5" || input == "6" || input == "7")
        output = pick(HIGH_SCRAMBLE;"4",HIGH_SCRAMBLE;"5",HIGH_SCRAMBLE;"A",HIGH_SCRAMBLE;"B",LOW_SCRAMBLE;"C",LOW_SCRAMBLE;"D",LOW_SCRAMBLE;"2",LOW_SCRAMBLE;"3")
    if(input == "0" || input == "1" || input == "2" || input == "3")
        output = pick(HIGH_SCRAMBLE;"8",HIGH_SCRAMBLE;"9",HIGH_SCRAMBLE;"A",HIGH_SCRAMBLE;"B",MED_SCRAMBLE;"C",MED_SCRAMBLE;"D",LOW_SCRAMBLE;"E",LOW_SCRAMBLE;"F")
    if(!output) output = "5"
    return output

我的程序模拟每个块从000-800 ( 800以上的值在一轮开始时不会出现在生物中)到至少DAC,使用上面概述的策略。它是用Python 3编写的,并输出一个CSV文件,其中包含两列:正在处理的块,以及到达DAC所需的时间(平均为4096次)。miniscramble_table函数是我对上面DreamMaker代码的直接翻译。

代码语言:javascript
复制
from random import random
from functools import reduce
import itertools


print("This program will calculate how many attempts it takes to get to DAC from every block 000-800, on average.")
block_attempts = input("How many attempts should be made per block? (default 4096, more is more accurate)")

if not block_attempts:
    block_attempts = 4096
else:
    try:
        block_attempts = int(block_attempts)
    except ValueError:
        print("ERR: Could not parse that number. Using 4096.")
user_bound = input("What should be the upper bound on the dataset? (default 800, lower is faster but less comprehensive)")
if not user_bound:
    user_bound = '800'
else:
    try:
        user_bound = [char for char in user_bound]
        assert(len(user_bound) == 3)
    except:
        print("ERR: Could not parse that bound. Using 800.")

user_target = input("What should be the target value for a block? (default DAC)")
if not user_target:
    user_target = ['D','A','C']
else:
    try:
        user_target = [char for char in user_target]
        assert(len(user_target) == 3)
    except: # bad, but like, c'mon
        print("ERR: Could not parse that bound. Using 800.")

# Generate a probability table. This is ugly because it's based off BYOND code.
def miniscramble_table(letter, rad_strength, rad_duration):
  HIGH_SCRAMBLE = rad_strength*10
  MED_SCRAMBLE  = rad_strength*10 - rad_duration
  LOW_SCRAMBLE  = rad_strength*5 + rad_duration

  picks = (("5"),(1.0)) # default, I guess.
  if (letter in ["C",  "D", "E", "F"]):
    picks = ("4", "5", "6", "7", "0", "1", "2", "3")
    probs = (*[HIGH_SCRAMBLE] * 4, *[LOW_SCRAMBLE] * 2, *[MED_SCRAMBLE] * 2)

  if (letter in ["8",  "9", "A", "B", "4", "5", "6", "7"]):
    picks = ("4", "5", "A", "B", "C", "D", "2", "3")
    probs = (*[HIGH_SCRAMBLE] * 4, *[LOW_SCRAMBLE] * 4)

  if (letter in ["0",  "1", "2", "3"]):
    picks = ("8", "9", "A", "B", "C", "D", "E", "F")
    probs = (*[HIGH_SCRAMBLE] * 4, *[MED_SCRAMBLE] * 2, *[LOW_SCRAMBLE] * 2)

  total = sum(probs)
  probs = map(lambda n: n/total, probs) # sums to 1

  # make the output nicer to work with...
  out = []
  prev = 0
  for pick, prob in zip(picks, probs):
    out.append((pick, prob+prev))
    prev += prob
  return out

def miniscramble(letter, rad_strength, rad_duration):
  r = random()
  table = miniscramble_table(letter, rad_strength, rad_duration)
  output = filter(lambda entry: entry[1] >= r, table)
  # print(r)
  return list(output)[0][0]

# tries to get from `initial` to at least `letters` with specified settings
# returns # of attempts to get there.
def scramble_to(initial=('3','5','7'), target=user_target, settings=(10,2), log=False):
  current = list(initial) # what are we looking at
  # letter-iterable to base10 number
  def concat_letters(letters): 
    return int(reduce(lambda x,y: x+y, letters, ''), 16)

  for attempts in enumerate(itertools.repeat(0)):
    if log: print(f'Miniscramble #{attempts[0]}:', ''.join(current))
    if concat_letters(current) >= concat_letters(target):
      if log: print(f'Done with {attempts[0]} miniscrambles!')
      return attempts[0] # done, since we're above/at the target!
    for i in range(3):
      if int(current[i], 16) < int(target[i], 16):
        current[i] = miniscramble(current[i], *settings)
        break # 1 `enumerate` per attempt

results = {}
def unactivated(seq):
  return int(''.join(seq), 16) < int(user_bound, 16) # blocks never start activated, so default is 800

dataset = filter(unactivated, (seq for seq in itertools.product([_ for _ in '0123456789ABCDEF'], repeat=3)))
for block in dataset: # go through a sample set of blocks, 54 in all
  # Give each block lots of attempts for bigger sample size. default=4096
  intermediate = []
  for _ in range(block_attempts):
    intermediate.append(scramble_to(initial=block, target=('D','A','C'), settings=(1,2)))
  results[block] = (sum(intermediate)/len(intermediate)) # average it out

# Convert results to CSV

out = []
for k,v in results.items():
    out.append(f'{"".join(k)},{v}\n')
filename = input('\nDone. Where should the results be saved? (leave blank to not save) ')
if filename:
    with open(filename, 'w') as outfile:
        [outfile.write(line) for line in out]
else:
    print("Not saving.")

我使用Excel绘制了下面的结果图:

问题:

  • 我还有什么能做得更好的?
  • 是否有库可以使这种模拟更容易编写,或者更快?
  • 现在生成输出需要相当长的时间:是否有任何明显的优化需要进行?让这个异步的、派生的工作线程在Cython中编译会有好处吗?
  • 如何使用像matplotlib这样的库生成类似的图?
EN

回答 1

Code Review用户

回答已采纳

发布于 2020-04-09 23:44:59

这个问题是一个“吸收马氏链”问题,可以解析地解决所期望的步骤数。

马尔可夫链具有对应于每个DNA块的节点或状态。miniscramble例程以及DNA修饰过程的步骤可以用来定义状态之间的转换概率。例如,0x000可以转换为0x100,0x200,0x300,.(只有第一个数字的变化)。类似地,0xD 05可以转到0xD 15 ...0xDF5 (只有第二个数字变化),依此类推。任何节点>= 0xDAC都是吸收节点。

代码可能更简洁,但它说明了这一点。

代码语言:javascript
复制
import numpy as np
import matplotlib.pyplot as plt

def make_transition_table(rad_strength, rad_duration):
    # Hi, Med, Lo transition weights
    H = rad_strength*10
    M = rad_strength*10 - rad_duration
    L = rad_strength*5 + rad_duration

    transition_probability = []

    # for digits 0, 1, 2, 3
    # picks    0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
    weights = [0, 0, 0, 0, 0, 0, 0, 0, H, H, H, H, M, M, L, L]
    total = sum(weights)
    probabilities = [w/total for w in weights]
    transition_probability.extend(probabilities for _ in '0123')

    # for digits 4, 5, 6, 7, 8, 9, A, B
    # picks    0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
    weights = [0, 0, L, L, H, H, 0, 0, 0, 0, H, H, L, L, 0, 0]
    total = sum(weights)
    probabilities = [w/total for w in weights]
    transition_probability.extend(probabilities for _ in '456789AB')

    # for digits C, D, E, F:
    #picks     0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
    weights = [L, L, M, M, H, H, H, H, 0, 0, 0, 0, 0, 0, 0, 0]
    total = sum(weights)
    probabilities = [w/total for w in weights]
    transition_probability.extend(probabilities for _ in 'CDEF')

    return transition_probability


rad_strength = 1
rad_duration = 2
transition = make_transition_table(rad_strength, rad_duration)

# build table of all transitions
# P[i][j] = prob to go from i to j
P = []

for i in range(0xFFF + 1):
    d0, rem = divmod(i, 0xFF)
    d1, d2 = divmod(rem, 0xF)

    row = [0]*4096

    if d0 < 0xD:
        start = d1*0xF + d2
        for c, j in enumerate(range(start, start + 0xF00 + 1, 0x100)):
            row[j] = transition[d0][c]

    elif d0 == 0xD:
        if d1 < 0xA:
            start = d0 * 0xFF + d2
            for c, j in enumerate(range(start, start + 0xF0 + 1, 0x10)):
                row[j] = transition[d1][c]

        elif d1 == 0xA:
            if d2 < 0xC:
                start = d0 * 0xFF + d1 * 0xF
                for c, j in enumerate(range(start, start + 0xF + 1, 0x1)):
                    row[j] = transition[d2][c]

    P.append(row)

# convert to numpy array to do to more easily 
# select Q and do the matrix math
P = np.array(P)

Q = P[:0xDAB,:0xDAB]

I = np.identity(Q.shape[0])

N = np.linalg.inv(I - Q)

# this is the same a N*1 as shown in the Wikipedia article
avg_steps = np.sum(N, axis=1)

# change indices for avg_steps to view different
# ranges of starting points
plt.plot(avg_steps[:0x801])
票数 2
EN
页面原文内容由Code Review提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://codereview.stackexchange.com/questions/240073

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档