Question: PyTorch RuntimeError: CUDA out of memory. Works fine in a Jupyter notebook, but not as a script

Stack Overflow user

Asked 2020-05-21 22:17:11

1 answer, 1.5K views, 0 followers, 2 votes

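I have a peculiar problem. I can run my code in a Jupyter notebook without any OOM error. However, when I run the same code as a script on Linux, it fails with an OOM error. Has anyone run into the same issue? I tried gc.collect() and torch.cuda.empty_cache() in the code, but neither helped.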

It always gives me this error: RuntimeError: CUDA out of memory. Tried to allocate 1.30 GiB (GPU 0; 7.79 GiB total capacity; 4.80 GiB already allocated; 922.69 MiB free; 6.12 GiB reserved in total by PyTorch)
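The figures in the error message can be worked through by hand. The following is a minimal sketch that uses only the numbers quoted in the message; the variable names are illustrative, and the interpretation in the comments is an assumption, not a diagnosis of this particular machine.

```python
# Figures copied from the error message (GiB unless noted otherwise).
total_gib = 7.79        # total capacity of GPU 0
allocated_gib = 4.80    # memory backing live tensors in this process
reserved_gib = 6.12     # memory held by PyTorch's caching allocator in this process
free_mib = 922.69       # memory the driver reports as free
requested_gib = 1.30    # size of the allocation that failed

free_gib = free_mib / 1024

# Cached by PyTorch but not currently backing a tensor. It may not be
# usable as one contiguous block if it is fragmented (assumption).
cached_unused_gib = reserved_gib - allocated_gib

# Memory that is neither free nor reserved by this process's allocator,
# for example the CUDA context or other processes on the same GPU (assumption).
outside_gib = total_gib - reserved_gib - free_gib

# How far the request exceeds what the driver can still hand out.
shortfall_gib = requested_gib - free_gib

print(f"cached but unused: {cached_unused_gib:.2f} GiB")
print(f"outside this allocator: {outside_gib:.2f} GiB")
print(f"shortfall for the request: {shortfall_gib:.2f} GiB")
```

Under these numbers, roughly 0.77 GiB of the card is in use outside this process's allocator, and the 1.30 GiB request exceeds the free memory by about 0.40 GiB.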

Code:

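import os
import pickle

import numpy as np
import pandas as pd
import spacy
from sklearn.feature_extraction.text import TfidfVectorizer
from tqdm import tqdm
from transformers import pipeline

# VEC_PICKLE_LOC (path of the pickled vectorizer) is defined elsewhere in the original script.

def lemmatize(phrase):
    """Return lemmatized words"""
    spa = spacy.load("en_core_web_sm")
    return " ".join([word.lemma_ for word in spa(phrase)])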

def reading_csv(path_to_csv):
    """Return text column in csv"""
    data = pd.read_csv(path_to_csv)
    ctx_paragraph = []
    for txt in data['text']:
        if not pd.isna(txt):
            ctx_paragraph.append(txt)
    return ctx_paragraph

def processing_question(ques, paragraphs, domain_lemma_cache, domain_pickle):
    """Return answer"""
    #Lemmatizing whole csv text column
    lemma_cache = domain_lemma_cache
    if not os.path.isfile(lemma_cache):
        lemmas = [lemmatize(par) for par in tqdm(paragraphs)]
        df = pd.DataFrame(data={'context': paragraphs, 'lemmas': lemmas})
        df.to_feather(lemma_cache)
    df = pd.read_feather(lemma_cache)
    paragraphs = df.context
    lemmas = df.lemmas
    # Vectorizer cache: fit once, then always load from disk
    if not os.path.isfile(VEC_PICKLE_LOC):
        vectorizer = TfidfVectorizer(
            stop_words='english', min_df=5, max_df=.5, ngram_range=(1, 3))
        vectorizer.fit(lemmas)
        with open(VEC_PICKLE_LOC, "wb") as f:
            pickle.dump(vectorizer, f)
    with open(VEC_PICKLE_LOC, "rb") as f:
        vectorizer = pickle.load(f)
    # Vectorized lemmas cache (uses the loaded vectorizer, so it also
    # works when only the vectorizer pickle already exists)
    if not os.path.isfile(domain_pickle):
        tfidf = vectorizer.transform(lemmas)
        with open(domain_pickle, "wb") as f:
            pickle.dump(tfidf, f)
    with open(domain_pickle, "rb") as f:
        tfidf = pickle.load(f)
    question = ques
    query = vectorizer.transform([lemmatize(question)])
    # Notebook leftover: shows the matched terms in a cell, but the result is discarded in a script
    (query > 0).sum(), vectorizer.inverse_transform(query)
    scores = (tfidf * query.T).toarray()
    results = (np.flip(np.argsort(scores, axis=0)))
    qapipe = pipeline('question-answering',
                      model='distilbert-base-uncased-distilled-squad',
                      tokenizer='bert-base-uncased',
                      device=0)
    candidate_idxs = [(i, scores[i]) for i in results[0:10, 0]]
    contexts = [(paragraphs[i], s) for (i, s) in candidate_idxs if s > 0.01]
    question_df = pd.DataFrame.from_records([{
        'question': question,
        'context':  ctx
    } for (ctx, s) in contexts])
    preds = qapipe(question_df.to_dict(orient="records"))
    answer_df = pd.DataFrame.from_records(preds)
    answer_df["context"] = question_df["context"]
    answer_df = answer_df.sort_values(by="score", ascending=False)
    return answer_df
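The functions above call spacy.load inside every lemmatize call and build the question-answering pipeline inside every processing_question call. The following is a minimal sketch of loading a heavy object once and reusing it, assuming repeated loads are not intended. Here `expensive_load`, `get_model`, and `load_calls` are hypothetical names, and `expensive_load` is only a stand-in for the real spacy.load or pipeline call, so the example runs without those libraries.

```python
from functools import lru_cache

load_calls = []  # records every real load, for illustration only


def expensive_load(name):
    """Hypothetical stand-in for spacy.load(name) or transformers.pipeline(...)."""
    load_calls.append(name)
    return {"model": name}


@lru_cache(maxsize=None)
def get_model(name):
    """Load the named model on first use and return the cached object afterwards."""
    return expensive_load(name)


# Three lookups, but only the first one triggers a real load.
first = get_model("en_core_web_sm")
second = get_model("en_core_web_sm")
third = get_model("en_core_web_sm")

print(len(load_calls))
print(first is second is third)
```

In the real script the body of `get_model` would call the actual loader. Whether this changes the peak GPU memory in the asker's case is not established by the question; it mainly avoids redundant loading.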

python

out-of-memory

pytorch

gpu

1 Answer

Stack Overflow user

Answered 2021-05-25 00:16:40

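Something similar happened to me recently.

I was running my model in a Jupyter notebook on an AWS EC2 p2.xlarge instance, and it ran correctly. I would then ssh into the same instance and rerun the same model as a .py script, and I got the OOM error you describe.

All I had to do to get the .py script working was restart the Jupyter notebook's kernel.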
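One plausible reading of this answer is that the notebook kernel was still holding GPU memory while the script ran, although the answer does not say so explicitly. The following is a minimal sketch for checking which processes hold GPU memory before launching a script, assuming nvidia-smi is installed and on the PATH. The function names and the sample row are hypothetical.

```python
import subprocess

# nvidia-smi query for per-process GPU memory, in MiB, without a header row.
QUERY = [
    "nvidia-smi",
    "--query-compute-apps=pid,process_name,used_memory",
    "--format=csv,noheader,nounits",
]


def parse_compute_apps(csv_text):
    """Parse 'pid, process_name, used_memory' rows into (pid, name, MiB) tuples."""
    rows = []
    for line in csv_text.splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) < 3 or not parts[0].isdigit():
            continue  # skip blank or malformed lines
        pid = int(parts[0])
        name = ",".join(parts[1:-1])  # tolerate commas inside the process name
        mem = parts[-1]
        used_mib = int(mem) if mem.isdigit() else None  # value may be unavailable
        rows.append((pid, name, used_mib))
    return rows


def gpu_processes():
    """Return the processes currently using the GPU, or [] if nvidia-smi cannot be run."""
    try:
        result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    except (FileNotFoundError, subprocess.CalledProcessError):
        return []
    return parse_compute_apps(result.stdout)


if __name__ == "__main__":
    for pid, name, used_mib in gpu_processes():
        print(pid, name, used_mib)
```

If a Jupyter kernel's Python process appears in this list with a large memory figure, restarting or shutting down that kernel releases its GPU memory before the script starts.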

Votes: 1

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/61944703
