AnalysisException Py4JJavaError on transformation ()

Stack Overflow user
Asked 2021-03-21 06:33:19
Answers: 1, Views: 327, Followers: 0, Votes: 0

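I am using withColumn() to do some basic transformations on a DataFrame, namely updating the values of a column. I am looking for some debugging help while I also study the problem.

PySpark raises an AnalysisException and a Py4JJavaError when I use the withColumn command.

_c49='EVENT_NARRATIVE' is the data element referenced by withColumn('EVENT_NARRATIVE') in the Spark df (DataFrame).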

from pyspark.sql.functions import *
from pyspark.sql.types import *
df = df.withColumn('EVENT_NARRATIVE', lower(col('EVENT_NARRATIVE')))

Error output:

Py4JJavaError: An error occurred while calling o100.withColumn.
: org.apache.spark.sql.AnalysisException: cannot resolve '`EVENT_NARRATIVE`' given input columns: [_c3, _c17, _c40, _c21, _c48, _c12, _c39, _c18, _c31, _c10, _c45, _c26, _c5, _c43, _c24, _c33, _c9, _c14, _c1, _c16, _c47, _c20, _c46, _c32, _c22, _c7, _c2, _c42, _c37, _c36, _c30, _c8, _c38, _c23, _c25, _c13, _c29, _c41, _c19, _c44, _c11, _c28, _c6, _c50, _c49, _c0, _c15, _c4, _c34, _c27, _c35];;
'Project [_c0#604, _c1#605, _c2#606, _c3#607, _c4#608, _c5#609, _c6#610, _c7#611, _c8#612, _c9#613, _c10#614, _c11#615, _c12#616, _c13#617, _c14#618, _c15#619, _c16#620, _c17#621, _c18#622, _c19#623, _c20#624, _c21#625, _c22#626, _c23#627, ... 28 more fields]
+- Relation[_c0#604,_c1#605,_c2#606,_c3#607,_c4#608,_c5#609,_c6#610,_c7#611,_c8#612,_c9#613,_c10#614,_c11#615,_c12#616,_c13#617,_c14#618,_c15#619,_c16#620,_c17#621,_c18#622,_c19#623,_c20#624,_c21#625,_c22#626,_c23#627,... 27 more fields] csv

Sample data (1 row) from df.head():

[Row(_c0='BEGIN_YEARMONTH', _c1='BEGIN_DAY', _c2='BEGIN_TIME', _c3='END_YEARMONTH', _c4='END_DAY', _c5='END_TIME', _c6='EPISODE_ID', _c7='EVENT_ID', _c8='STATE', _c9='STATE_FIPS', _c10='YEAR', _c11='MONTH_NAME', _c12='EVENT_TYPE', _c13='CZ_TYPE', _c14='CZ_FIPS', _c15='CZ_NAME', _c16='WFO', _c17='BEGIN_DATE_TIME', _c18='CZ_TIMEZONE', _c19='END_DATE_TIME', _c20='INJURIES_DIRECT', _c21='INJURIES_INDIRECT', _c22='DEATHS_DIRECT', _c23='DEATHS_INDIRECT', _c24='DAMAGE_PROPERTY', _c25='DAMAGE_CROPS', _c26='SOURCE', _c27='MAGNITUDE', _c28='MAGNITUDE_TYPE', _c29='FLOOD_CAUSE', _c30='CATEGORY', _c31='TOR_F_SCALE', _c32='TOR_LENGTH', _c33='TOR_WIDTH', _c34='TOR_OTHER_WFO', _c35='TOR_OTHER_CZ_STATE', _c36='TOR_OTHER_CZ_FIPS', _c37='TOR_OTHER_CZ_NAME', _c38='BEGIN_RANGE', _c39='BEGIN_AZIMUTH', _c40='BEGIN_LOCATION', _c41='END_RANGE', _c42='END_AZIMUTH', _c43='END_LOCATION', _c44='BEGIN_LAT', _c45='BEGIN_LON', _c46='END_LAT', _c47='END_LON', _c48='EPISODE_NARRATIVE', _c49='EVENT_NARRATIVE', _c50='DATA_SOURCE'),
 Row(_c0='201210', _c1='29', _c2='1600', _c3='201210', _c4='29', _c5='1922', _c6='68680', _c7='416744', _c8='NEW HAMPSHIRE', _c9='33', _c10='2012', _c11='October', _c12='High Wind', _c13='Z', _c14='12', _c15='EASTERN HILLSBOROUGH', _c16='BOX', _c17='29-OCT-12 16:00:00', _c18='EST-5', _c19='29-OCT-12 19:22:00', _c20='0', _c21='0', _c22='0', _c23='0', _c24='109.60K', _c25='0.00K', _c26='ASOS', _c27='55.00', _c28='MG', _c29=None, _c30=None, _c31=None, _c32=None, _c33=None, _c34=None, _c35=None, _c36=None, _c37=None, _c38=None, _c39=None, _c40=None, _c41=None, _c42=None, _c43=None, _c44=None, _c45=None, _c46=None, _c47=None, _c48='Sandy, a hybrid storm with both tropical and extra-tropical characteristics, brought high winds and coastal flooding to southern New England.  Easterly winds gusted to 50 to 60 mph for interior southern New England; 55 to 65 mph along the eastern Massachusetts coast and along the I-95 corridor in southeast Massachusetts and Rhode Island; and 70 to 80 mph along the southeast Massachusetts and Rhode Island coasts.  A few higher higher gusts occurred along the Rhode Island coast.  A severe thunderstorm embedded in an outer band associated with Sandy produced wind gusts to 90 mph and concentrated damage in Wareham early Tuesday evening, |a day after the center of Sandy had moved into New Jersey.  In general, moderate coastal flooding occurred along the Massachusetts coastline, and major coastal flooding impacted the Rhode Island coastline.  The storm surge was generally 2.5 to 4.5 feet along the east coast of Massachusetts, but peaked late Monday afternoon in between high tide cycles.  Seas built to between 20 and 25 feet Monday afternoon and evening just off the Massachusetts east coast.  Along the south coast, the storm surge was 4 to 6 feet and seas from 30 to a little over 35 feet were observed in the outer coastal waters.  
The very large waves on top of the storm surge caused destructive coastal flooding along stretches of the Rhode Island exposed south coast.  ||Sandy grew into a hurricane over the southwest Caribbean and then headed north across Jamaica, Cuba, and the Bahamas.  As Sandy headed north of the Bahamas, the storm interacted with a vigorous weather system moving west to east across the United States and began to take on a hybrid structure.  Strong high pressure over southeast Canada helped with the expansion of the strong winds well north of the center of Sandy.  In essence, Sandy retained the structure of a hurricane near its center (until shortly before landfall) while taking on more of an extra-tropical cyclone configuration well away from the center.  Sandy’s track was unusual.  The storm headed northeast and then north across the western Atlantic and then sharply turned to the west to make landfall near Atlantic City, NJ during Monday evening.  Sandy subsequently weakened and moved west across southern Pennsylvania on Tuesday before turning north and heading across western New York state into Quebec during Tuesday night and Wednesday.', _c49='The Automated Surface Observing System at Manchester-Boston Regional Airport (KMHT) recorded sustained wind speeds of 38 mph and gusts to 63 mph.  In Manchester, a tree was downed on Harrison Street.  In Hudson, a tree was downed on Lawrence Road, bringing down wires that sparked a fire that damaged a house.  In Merrimack, a tree was downed, taking down wires and closing Amherst Road from Meetinghouse Road to Riverside Drive.  In Nashua, a tree was downed onto a house on Broad Street, near the Hollils line.  No structural damage was found.  Numerous trees were downed, blocking roads.', _c50='CSV')

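1 Answer

Stack Overflow user

Accepted answer

Answered 2021-03-21 06:53:01

The column names have the form _c followed by a number, presumably because you did not specify header=True when reading the input file. You can do: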

df = spark.read.csv('filepath', header=True)

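so that the column names will be BEGIN_YEARMONTH, BEGIN_DAY, and so on instead of _c0, _c1, and so on, and then your withColumn code will work.

Also consider adding inferSchema=True so that the columns get appropriate data types.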
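
As a rough illustration of the read described above (a sketch, not the asker's actual pipeline), the code below writes a tiny CSV and loads it with both options. The file name `storm_events_sample.csv`, the three-column sample, and the helpers `write_sample_csv`, `load_events`, and `main` are made up for this example; the PySpark calls used are `spark.read.csv(..., header=True, inferSchema=True)`, `withColumn`, `lower`, and `col`.

```python
import csv
import os
import tempfile

# Hypothetical three-column sample standing in for the 51-column storm file.
SAMPLE_ROWS = [
    ["EVENT_ID", "STATE", "EVENT_NARRATIVE"],
    ["416744", "NEW HAMPSHIRE", "In Manchester, a tree was downed on Harrison Street."],
]


def write_sample_csv(directory):
    """Write SAMPLE_ROWS to a CSV file in `directory` and return its path."""
    path = os.path.join(directory, "storm_events_sample.csv")
    with open(path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(SAMPLE_ROWS)
    return path


def load_events(spark, path):
    """Read the CSV with its first line as the header, then lowercase one column."""
    # Imported here so the file-writing helper works without PySpark installed.
    from pyspark.sql.functions import col, lower

    df = spark.read.csv(path, header=True, inferSchema=True)
    return df.withColumn("EVENT_NARRATIVE", lower(col("EVENT_NARRATIVE")))


def main():
    """Run the sketch end to end on a local Spark session."""
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master("local[1]").appName("header-demo").getOrCreate()
    with tempfile.TemporaryDirectory() as tmp:
        df = load_events(spark, write_sample_csv(tmp))
        df.printSchema()  # EVENT_ID should be inferred as an integer type
        df.show(truncate=False)
    spark.stop()
```

Calling `main()` in an environment with PySpark and a Java runtime should print a schema whose column names come from the first line of the file, so `EVENT_NARRATIVE` resolves by name.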

Of course, you could also keep your current code and do the following:

df2 = df.withColumn('_c49', lower(col('_c49')))

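But that is not a good long-term solution. Column names should be sensible, and you also don't want the header to be one of the rows in your DataFrame.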
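
If re-reading the file is not practical, one possible way to repair a DataFrame that was already loaded without a header is sketched below. This goes beyond what the answer proposes and rests on two assumptions: that `df.first()` returns the header line (usually true for a single CSV file, but Spark does not guarantee row order in general), and that every header value is a non-empty string. The helpers `build_rename_pairs` and `promote_first_row` are hypothetical names.

```python
def build_rename_pairs(header_values):
    """Pair Spark's default names (_c0, _c1, ...) with the real header names."""
    return [(f"_c{i}", name) for i, name in enumerate(header_values)]


def promote_first_row(df):
    """Rename the _cN columns from the first row, then drop that row from the data."""
    # Imported here so build_rename_pairs stays usable without PySpark installed.
    from pyspark.sql.functions import col

    header = df.first()  # assumption: the header line is the first row returned
    pairs = build_rename_pairs(list(header))

    # Drop the header row while the columns still carry their _cN names.
    # eqNullSafe keeps rows whose first column is null instead of filtering them out.
    first_old, first_new = pairs[0]
    data = df.filter(~col(first_old).eqNullSafe(first_new))

    # toDF assigns the new names positionally.
    return data.toDF(*[new for _, new in pairs])
```

With the header row shown in the question, the pair at position 49 would be `('_c49', 'EVENT_NARRATIVE')`, so the original `withColumn('EVENT_NARRATIVE', ...)` call would resolve on the returned DataFrame.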

Votes: 0

Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/66729488
