Q: NodeJS stream exceeding heap

Stack Overflow user

Asked 2016-08-16 00:35:17

1 answer · 1.1K views · 0 followers · 3 votes

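I'm trying to massage some data from a roughly 400 MB csv file and save it into a database for local querying. The data is the freely available ip2location lite database, and the database I'm trying to import it into is the embedded nedb.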

require('dotenv').load()

const fs = require('fs')
const csv = require('csv-parse')
const es = require('event-stream')
const Datastore = require('nedb')
const BatchStream = require('batch-stream')

const db = new Datastore({ filename: process.env.DB_PATH, autoload: true })
const debug = require('debug')('setup')

function massage ([ipLo, ipHi, cc, country, area, city, lat, lng]) {
  return { ipLo, ipHi, cc, country, area, city, lat, lng }
}

function setup () {
  let qty = 0

  return new Promise((resolve, reject) => {
    fs.createReadStream(process.env.IP2LOCATION_PATH)
      // read and parse csv
      .pipe(csv())
      // batch it up
      .pipe(new BatchStream({ size: 100 }))
      // write it into the database
      .pipe(es.map((batch, cb) => {
        // massage and persist it
        db.insert(batch.map(massage), err => {
          qty += batch.length
          if (qty % 100 === 0)
            debug(`Inserted ${qty} documents…`)
          // arrow functions have no `arguments` of their own, so forward the error explicitly
          cb(err)
        })
      }))
      .on('end', resolve)
      .on('error', reject)
  })
}

module.exports = setup

if (!module.parent) {
  debug('Setting up geo database…')
  setup()
    .then(_ => debug('done!'))
    .catch(err => debug('there was an error :/', err))
}

After about 75,000 entries, I get the following error:

<--- Last few GCs --->

   80091 ms: Mark-sweep 1372.0 (1435.0) -> 1371.7 (1435.0) MB, 1174.6 / 0 ms (+ 1.4 ms in 1 steps since start of marking, biggest step 1.4 ms) [allocation failure] [GC in old space requested].
   81108 ms: Mark-sweep 1371.7 (1435.0) -> 1371.6 (1435.0) MB, 1017.2 / 0 ms [last resort gc].
   82158 ms: Mark-sweep 1371.6 (1435.0) -> 1371.6 (1435.0) MB, 1049.9 / 0 ms [last resort gc].


<--- JS stacktrace --->

==== JS stack trace =========================================

Security context: 0x4e36fec9e31 <JS Object>
    1: substr [native string.js:~320] [pc=0xdab4e7f1185] (this=0x35500e175a29 <Very long string[65537]>,Q=50,am=65487)
    2: __write [/Users/arnold/Develop/mount-meru/node_modules/csv-parse/lib/index.js:304] [pc=0xdab4e7b8f98] (this=0x350ff4f97991 <JS Object>,chars=0x35500e175a29 <Very long string[65537]>,end=0x4e36fe04299 <false>,callback=0x4e36fe04189 <undefined>)
    3: arguments adaptor fra...

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
 1: node::Abort() [/usr/local/Cellar/node/6.3.1/bin/node]
 2: node::FatalException(v8::Isolate*, v8::Local<v8::Value>, v8::Local<v8::Message>) [/usr/local/Cellar/node/6.3.1/bin/node]
 3: v8::Utils::ReportApiFailure(char const*, char const*) [/usr/local/Cellar/node/6.3.1/bin/node]
 4: v8::internal::V8::FatalProcessOutOfMemory(char const*, bool) [/usr/local/Cellar/node/6.3.1/bin/node]
 5: v8::internal::Factory::NewByteArray(int, v8::internal::PretenureFlag) [/usr/local/Cellar/node/6.3.1/bin/node]
 6: v8::internal::TranslationBuffer::CreateByteArray(v8::internal::Factory*) [/usr/local/Cellar/node/6.3.1/bin/node]
 7: v8::internal::LCodeGenBase::PopulateDeoptimizationData(v8::internal::Handle<v8::internal::Code>) [/usr/local/Cellar/node/6.3.1/bin/node]
 8: v8::internal::LChunk::Codegen() [/usr/local/Cellar/node/6.3.1/bin/node]
 9: v8::internal::OptimizedCompileJob::GenerateCode() [/usr/local/Cellar/node/6.3.1/bin/node]
10: v8::internal::Compiler::GetConcurrentlyOptimizedCode(v8::internal::OptimizedCompileJob*) [/usr/local/Cellar/node/6.3.1/bin/node]
11: v8::internal::OptimizingCompileDispatcher::InstallOptimizedFunctions() [/usr/local/Cellar/node/6.3.1/bin/node]
12: v8::internal::StackGuard::HandleInterrupts() [/usr/local/Cellar/node/6.3.1/bin/node]
13: v8::internal::Runtime_StackGuard(int, v8::internal::Object**, v8::internal::Isolate*) [/usr/local/Cellar/node/6.3.1/bin/node]
14: 0xdab4e60961b
15: 0xdab4e7f1185
16: 0xdab4e7b8f98
[1]    18102 abort      npm run setup

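What is going on? Isn't the whole point of streams that you don't have to hold a large amount of data in memory at once, and can instead process it piece by piece? It looks like the error is coming directly from the csv parsing library, is that right?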
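For reference, here is a minimal sketch of a pipeline in which the consumer throttles the producer, using only Node's built-in `stream` module. `fakeInsert` and `importRows` are hypothetical names chosen for illustration; they are not part of nedb, csv-parse, or event-stream, and the sketch assumes a Node version that provides `Readable.from` and `stream.pipeline`.

```javascript
const { Readable, Writable, pipeline } = require('stream')

// Hypothetical stand-in for an asynchronous database insert.
function fakeInsert (docs, done) {
  setImmediate(done, null)
}

// Pushes rows through a writable sink and resolves with the number inserted.
function importRows (rows, insert = fakeInsert) {
  let inserted = 0

  const sink = new Writable({
    objectMode: true,
    highWaterMark: 16, // roughly 16 rows may be buffered before the source is paused
    write (row, _encoding, callback) {
      // The next row is not delivered until callback fires,
      // so the insert itself paces the pipeline.
      insert([row], err => {
        if (!err) inserted += 1
        callback(err)
      })
    }
  })

  return new Promise((resolve, reject) => {
    // pipeline forwards an error from any stage to this single callback.
    pipeline(Readable.from(rows), sink, err => {
      if (err) reject(err)
      else resolve(inserted)
    })
  })
}
```

Calling `importRows(rows).then(n => console.log(n))` logs the number of rows written. Backpressure of this kind only bounds what the streams buffer; it does not limit memory that the insert target keeps for itself.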

node.js

csv

stream

1 Answer

Stack Overflow user

Accepted answer

Posted 2016-08-21 21:11:41

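After some debugging, I found that the memory leak was in a third-party library I was using (specifically nedb). I suppose it isn't meant to store that many documents anyway, so I decided to replace it.

Some articles I found useful for tracking down the problem: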

https://github.com/felixge/node-memory-leak-tutorial
https://hacks.mozilla.org/2012/11/tracking-down-memory-leaks-in-node-js-a-node-js-holiday-season/
http://blog.yld.io/2015/08/10/debugging-memory-leaks-in-node-js-a-walkthrough/
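As a rough illustration of the kind of check that can narrow a leak down to one component, the sketch below samples V8 heap usage after each repetition of a step. It is an assumption about how such a check might look, not the procedure used in the answer; `heapUsedMb`, `sampleHeap`, and `step` are hypothetical names.

```javascript
// Returns the current V8 heap usage in megabytes, rounded to one decimal place.
function heapUsedMb () {
  return Math.round(process.memoryUsage().heapUsed / 1024 / 1024 * 10) / 10
}

// Runs an async step repeatedly and records heap usage after each run.
// `step` is a hypothetical function returning a promise, e.g. one batch insert.
async function sampleHeap (step, runs) {
  const samples = []
  for (let i = 0; i < runs; i++) {
    await step(i)
    samples.push(heapUsedMb())
  }
  return samples
}
```

If the samples keep climbing while only one library is exercised inside `step`, that library is a reasonable suspect. A heap snapshot, as described in the articles above, would be needed to confirm it.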

3 votes

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/38964964
