How can I ignore blank lines when mapping zipmap over a file?
(defn csv-data->maps [csv-data]
  (map zipmap
       (->> (first csv-data) ;; First row is the header
            repeat)
       (rest csv-data)))

Posted on 2020-03-31 22:12:11
The easiest way is to reuse an existing library.
(ns tst.demo.core
  (:use tupelo.core tupelo.test)
  (:require
    [clojure.string :as str]
    [schema.core :as s]
    [tupelo.csv :as csv]))

(s/defn remove-blank-lines :- s/Str
  "Accepts a multi-line text string, and returns one with any blank lines removed."
  [text-str :- s/Str]
  (let [text-lines      (str/split-lines text-str)
        lines-no-blanks (remove str/blank? text-lines)
        text-no-blanks  (str/join \newline lines-no-blanks)]
    text-no-blanks))
(dotest
  (let [csv-text           "zip-postal-code,store-num,chain-rank
01002,00006,4
01002,00277,5
01003,00277,5
01008,01217,5
01009,00439,5
01020,01193,5"
        csv-text-no-blanks (remove-blank-lines csv-text)
        csv-entities       (csv/parse->entities csv-text-no-blanks)
        csv-attrs          (csv/entities->attrs csv-entities)]
    (is= csv-entities
      [{:zip-postal-code "01002", :store-num "00006", :chain-rank "4"}
       {:zip-postal-code "01002", :store-num "00277", :chain-rank "5"}
       {:zip-postal-code "01003", :store-num "00277", :chain-rank "5"}
       {:zip-postal-code "01008", :store-num "01217", :chain-rank "5"}
       {:zip-postal-code "01009", :store-num "00439", :chain-rank "5"}
       {:zip-postal-code "01020", :store-num "01193", :chain-rank "5"}])
    (is= csv-attrs
      {:store-num       ["00006" "00277" "00277" "01217" "00439" "01193"],
       :zip-postal-code ["01002" "01002" "01003" "01008" "01009" "01020"],
       :chain-rank      ["4" "5" "5" "5" "5" "5"]})))

As the example shows, you can get the CSV data in either row-oriented form (entity maps) or column-oriented form (attribute vectors). For documentation on the library used here, see tupelo.csv.
Another way (perhaps simpler) is to preprocess the file with a simple Unix tool such as sed. Consider an example file:
~/expr/demo > cat csv.txt
zip-postal-code,store-num,chain-rank
01002,00006,4
01002,00277,5
01003,00277,5
01008,01217,5
01009,00439,5
01020,01193,5

and process it with sed (Stream EDitor):
~/expr/demo > sed '/^ *$/d' csv.txt
zip-postal-code,store-num,chain-rank
01002,00006,4
01002,00277,5
01003,00277,5
01008,01217,5
01009,00439,5
01020,01193,5

Voilà!
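If neither an extra library nor a preprocessing step is an option, the blank rows can also be dropped inside the function from the question. The following is only a minimal sketch, not part of the original answer. It assumes the rows come from a parser such as clojure.data.csv, where a blank line typically appears as a row holding a single empty string. The names `blank-row?` and `csv-data->maps-skip-blank` are hypothetical.

```clojure
(ns demo.csv-blank-rows
  (:require [clojure.string :as str]))

(defn blank-row?
  "True when every field in the row is blank (an empty row also counts as blank)."
  [row]
  (every? str/blank? row))

(defn csv-data->maps-skip-blank
  "Like csv-data->maps from the question, but drops blank rows before
  pairing the header with each remaining data row."
  [csv-data]
  (let [rows   (remove blank-row? csv-data)
        header (first rows)] ;; First non-blank row is the header
    (map zipmap
         (repeat header)
         (rest rows))))

;; Usage: rows given as vectors of strings, with blank rows mixed in.
(csv-data->maps-skip-blank
  [["zip-postal-code" "store-num"]
   [""]
   ["01002" "00006"]
   ["" " "]
   ["01003" "00277"]])
```

Filtering before taking the header means a blank line at the top of the file is not mistaken for the header row.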
https://stackoverflow.com/questions/60960476