文章/答案/技术大牛

发布

问策略:狡猾的表亲女巫
EN

Code Golf用户

提问于 2022-09-01 12:38:47

回答 3查看 600关注 0票数 5

巫师有一个狡猾的表弟，他是个女巫。她看不起巫师，认为他和他的谜题在数学上是天真的。在阅读他最新的拼图时，她嘲笑他总是用她(不公平)描述的简单的解决方案来问离散的问题，而真正的、恰当的问题应该是连续的。为了证明她的观点，她提出了以下版本的巫师的谜题。(他不情愿地允许部分剽窃。)

请考虑以下设置。一个狡猾的女巫有一个从0到10的实数线，这是隐藏在你面前的。另外，她选择了一个随机整数x \in \{0, \dots, 10\}，并将多个点均匀地放置在数字线上。更准确地说，她将每个x点独立而一致地随机放置在数字线上。你的任务是证明，x = 10和如果你这样做，女巫会答应你，她的承诺是一个比她的堂兄所能提供的更好的愿望。

在这个游戏中，你可以在每一步选择一个浮点数y，女巫会告诉你数字线上值小于或等于y的点数。

然而，女巫，至少和她的表妹一样邪恶，不会让你选择一个比9更大的数字。这可能仍然是可以的，因为你仍然可以找到10分，事实上，唯一满足愿望的方法是找到所有的10点，值9或更少。

选择浮点数y的成本是2^{y}美元。在任何时候，您都可以选择放弃这组点，让她重新开始整个过程(使用一个新的随机x)。当然，如果你选择了数字9，你仍然没有找到10分，你别无选择，只能放弃，重新开始。但如果你选择了一个小于9的数字，你可能会想放弃。遗憾的是，你永远得不到任何回报，所以你的成本只会继续下去。

你的目标是设计一个策略，以最低的预期成本实现你的愿望。你应该报告你的平均成本。

测试

一旦你选择了你的策略，你应该运行它，直到你得到愿望10,000次，并报告平均成本。如果两个答案有相同的策略，发布的第一个就赢了。如果两种策略具有相似的平均成本，您可能需要对其进行100,000次甚至更多次的测试才能看出差异。当然，如果你能直接计算出预期的成本，那就更好了。

输入输出

在这一挑战中没有外部投入。输出只是得到一个愿望的平均成本。要测试您的代码，您需要同时实现女巫和您的策略。

什么是天真的分数？

如果你每次只选择9点，那么\frac{1}{\frac{1}{11} \cdot \frac{9}{10}^{10}} \approx 31.5就会试图找出10分。这将花费您大约16152.4美元。你能做得更好吗？

Notes

在计算代码时，您可以假设浮点数是精确的数学实数，并且您最喜欢的库中的随机数生成器是完美的(但请不要使用C中的默认兰德()，这非常糟糕)。

code-challenge

回答 3

Code Golf用户

发布于 2022-09-01 19:48:15

Python，得分= 2610.6431 (2610.5938有400个猜测值)

在一组相同间隔的可能猜测(默认情况下为200)上找到最优策略。

以下是每一个发现点的新猜测和旧猜测的图表；0的猜测意味着重置。你想怎么做就怎么做。

import itertools
import random

import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import binom

bound = 16153
samples = 200

g = np.linspace(0, 9, samples)

def logsumexp(x):
    m = x.max()
    return np.log(np.sum(np.exp(x - m))) + m

def init_trans(samples):
    print("computing transition matrix...\n")
    trans = np.full((samples, 10, samples, 11), float('-inf'))
    g = np.linspace(0, 9, samples)
    for i0 in range(samples - 1):
        print(f"computing row {i0} of {samples}", end='\r', flush=True)
        g0 = g[i0]
        for n in range(10):
            k = np.arange(n, 11)
            u = np.zeros(k.shape) if i0 == 0 else binom(k, g0 / 10).logpmf(n)
            for i1 in range(i0 + 1, samples):
                g1 = g[i1]
                d = binom(k - n, (g1 - g0) / (10 - g0))
                for m in range(n, 11):
                    v = d.logpmf(m - n)
                    trans[i0,n,i1,m] = logsumexp(u + v)
    print()
    trans -= trans.max(axis=-1, keepdims=True)
    trans = np.exp(trans)
    trans /= trans.sum(axis=-1, keepdims=True)
    return trans

trans_file= f'trans_{samples}.npy'
try:
    trans = np.load(trans_file)
except FileNotFoundError:
    trans = init_trans(samples)
    np.save(trans_file, trans)

def get_policy(trans):

    print("estimating policy...")
    values_file = f"values_{samples}.npy"
    policy_file = f"values_{samples}.npy"
    grad = np.zeros((samples, 11))
    values = np.zeros((samples, 11))
    policy = np.zeros((samples, 11), dtype=np.int64)
    x = bound
    while True:
        values[-1,:-1] = x
        grad[-1,:-1] = 1
        for i0 in range(samples - 2, -1, -1):
            for n in range(10):
                g1 = g[i0+1:]
                v1 = 2 ** g1 + np.einsum('ij, ij -> i', trans[i0,n,i0+1:,n:], values[i0+1:,n:])
                di = np.argmin(v1)
                i1 = i0 + di + 1
                if v1[di] <= x:
                    values[i0,n] = v1[di]
                    grad[i0,n] = np.sum(trans[i0,n,i1,n:] * grad[i1,n:])
                    policy[i0,n] = i1
                else:
                    values[i0,n] = x
                    grad[i0,n] = 1
                    policy[i0,n] = 0
        if values[0,0] >= x:
            break
        x += (values[0,0] - x) / (1 - grad[0,0])
    print(f"computed cost = {x}")
    return values, policy

values_file = f"values_{samples}.npy"
policy_file = f"policy_{samples}.npy"
try:
    values, policy = np.load(values_file), np.load(policy_file)
except FileNotFoundError:
    values, policy = get_policy(trans)
    np.save(values_file, values)
    np.save(policy_file, policy)

for i in range(10):
    plt.subplot(2, 5, i + 1)
    plt.plot(np.linspace(0, 1, samples), policy[:,i], color='blue')
plt.show()

print('running simulations...')

mean_cost = 0
cost_m2 = 0
for t in itertools.count():

    cost = 0

    while True:
        n = random.randrange(11)
        p = [10 * random.random() for _ in range(n)]
        i0 = 0
        f0 = 0
        done = False
        while True:
            i1 = int(policy[i0,f0])
            if i1 == 0:
                break
            g1 = g[i1]
            assert (0 <= g1 <= 9)
            cost += 2 ** g1
            f1 = sum(1 for x in p if x <= g1)
            if f1 == 10:
                done = True
                break
            if i1 == len(g) - 1:
                break
            i0 = i1
            f0 = f1
        if done:
            break

    mean_cost1 = mean_cost + (cost - mean_cost) / (t + 1.)
    cost_m2 += (cost - mean_cost) * (cost - mean_cost1)
    mean_cost = mean_cost1
    if t % 1000 == 0:
        print(f"{t} trials run, cost = {mean_cost} ± {np.sqrt(cost_m2 / (t + 1) ** 2):.3f}")

票数 3

Code Golf用户

发布于 2022-09-01 13:47:25

锈病，score= ~6,081.9

猜测每一个数字1-9。如果找到的数字小于0.8 * number，则中止。

任何人都可以自由地复制这个答案，并进一步优化参数(0.8中止限制和1.0步骤大小)，当然也可以修改其他任何内容。

use rand;
use rand::{Rng, RngCore};

struct Which {
    rng: rand::rngs::ThreadRng,
    numbers: Vec<f32>,
    score: f32
}

impl Which {
    fn new(rng: rand::rngs::ThreadRng) -> Which {
        return Which{rng: rng, numbers: vec![], score: 0.0};
    }

    fn regenerate(&mut self) {
        let number = self.rng.next_u32()%11;
        self.numbers = (0..number).map(|i|self.rng.gen::<f32>()*10.0).collect();
    }

    fn guess(&mut self, n: f32) -> usize {
        self.score += 2.0_f32.powf(n);
        self.numbers.iter().filter(|i|**i<n).count()
    }
}

fn algo(which: &mut Which) {
    which.regenerate();

    loop {
        let mut guess = 1;
        while guess<=9 {
            let number = which.guess(guess as f32);
            if number >= 10 {
                return;
            }
            if (number as f32) < (guess as f32) * 0.8 {
                break
            }
            guess+=1;
        }
        which.regenerate();
    }

}

fn main() {

    let mut total_score:f64 = 0.;
    for _ in 0..10_000 {
        let mut which = Which::new(rand::thread_rng());

        algo(&mut which);
        total_score+=which.score as f64;
    }

    println!("total score: {:?}", total_score/ 10_000.);

}

基本的初始答案，可能甚至不接近最优。

票数 1

Code Golf用户

发布于 2022-09-01 22:04:23

Javascript，平均成本= $~4975

得分就是这样的吗？我没有时间再测试5,000万次了

这不是目前为止最好的分数，但我只是想尝试一下。我确实有另外一个想法，我想尝试使用一个新的策略，但我只会张贴它，如果它表现得体：

这段代码会检查是否至少低于5点。如果有，则全部进入，检查低于9点。不管怎样，在那之后，它就放弃了。

cheater=_=>cheater()
//this crashes the program if i try to check higher than 9 :P

witch=_=>{ //witch takes no input
  let z=Math.floor(Math.random()*11)
  //selects an integer from 0 to 10
  let points=[]
  for(let i=0;i<z;i++){
    points.push(Math.random()*10)
    //plots that many points on the line from 0 to 10

  }
  return x=>x>9?cheater():points.filter(w=>w<x).length
  //and returns a function which takes n as input and outputs how many points exist below n (or crashes if n>9)
}

solver=_=>{
  let current=witch()
  //calling witch generates a new set of points, which i can check with current
  let cost=0
  while(1){
    cost+=2**5
    //i pre-emptively add the cost of each inspection on the line
    if(current(5)==10){
      return cost
    }else if(current(5)>4){
    //i think its ok to use the same inspection twice at no cost, because theres no new information, and its just easier and nicer looking than storing the result in a variable or something
      cost+=2**9
      if(current(9)==10){
        return cost
      }
    }
    current=witch()
    //this is the "give up" part
  }
}

tester=(m)=>{
  //m is the number of iterations to test
  let costlist=[]
  for(let i=0;i<m;i++){
    costlist.push(solver())
  }
  return costlist.reduce((a,b)=>a+b,0)/costlist.length
  //returns the average cost
  //i ran tester(10000000) thrice and got 4972.0177856, 4972.7681056, and 4974.2599296.
}

票数 0

页面原文内容由Code Golf提供。腾讯云小微IT领域专用引擎提供翻译支持

原文链接：

https://codegolf.stackexchange.com/questions/251528

复制

相似问题

问策略:狡猾的表亲女巫
EN

测试

输入输出

什么是天真的分数？

Notes

回答 3

Code Golf用户

Python，得分= 2610.6431 (2610.5938有400个猜测值)

Code Golf用户

锈病，score= ~6,081.9

Code Golf用户

Javascript，平均成本= $~4975

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问策略:狡猾的表亲女巫EN

测试

输入输出

什么是天真的分数？

Notes

回答 3

Code Golf用户

Python，得分= 2610.6431 (2610.5938有400个猜测值)

Code Golf用户

锈病，score= ~6,081.9

Code Golf用户

Javascript，平均成本= $~4975

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问策略:狡猾的表亲女巫
EN