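A friend of mine was asked this question in an interview, and neither of us could work out how to solve it. I would really appreciate your help!
Question:
I have two dataframes that I want to merge: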
Dataframe 1:

Dataframe 2:

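My thought process was:
Merge the dataframes on a "time bucket" column created beforehand.
I did not know what the fastest way to do this in an interview would be.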
Posted on 2021-02-20 16:36:41
I don't think you can merge these two dataframes directly. After some preprocessing, you may need to use join instead.
import pandas as pd

dict1 = {
    "A": "9 AM",
    "B": "21:00",
    "C": "3 PM",
    "D": "15:00"
}
timings = ["3-5 PM", "5-7 PM", "7-9 PM", "9-11 PM"]

def convertFrom24HourTo12HourFormat(value):
    """
    Convert a 24-hour time such as "21:00" to 12-hour form such as "9 PM".
    Values without a colon (already 12-hour, e.g. "9 AM") are returned unchanged.
    Minutes are dropped.
    """
    # assuming all data is valid
    split = value.split(":")
    if len(split) == 1:
        return value
    hours = int(split[0])
    meridian = "PM" if hours >= 12 else "AM"
    # 0 and 12 both display as 12 on a 12-hour clock
    return "{} {}".format(hours % 12 or 12, meridian)

def mapTimingToTimings(timingList, timingsList):
    """
    Map each 12-hour timing (e.g. "3 PM") to the bucket (e.g. "3-5 PM") containing it.
    Timings that fall in no bucket map to None.
    """
    resultTimings = []
    # for each timing
    for timing in timingList:
        # find out the range in which it falls
        hour, meridian = timing.split(" ")
        result = None
        for bucket in timingsList:
            hours, bucketMeridian = bucket.split(" ")
            start, end = hours.split("-")
            # buckets are half-open (start <= hour < end) and are assumed
            # not to cross noon or midnight
            if meridian == bucketMeridian and int(start) <= int(hour) < int(end):
                result = bucket
                break
        resultTimings.append(result)
    # return it
    return resultTimings

for key, value in dict1.items():
    dict1[key] = convertFrom24HourTo12HourFormat(value)

df = pd.DataFrame(list(dict1.items()), columns=["Flight Name", "Flight Timing"])
properTimings = mapTimingToTimings(df["Flight Timing"].values, timings)
result = df.join(pd.Series(properTimings, name="Flight Timings"))
print(result)

Output:

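I'm sure others can suggest better optimizations, but given that this was asked in an interview, coming up with optimizations in that situation is hard.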
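As one possible direction for such an optimization, here is a minimal sketch that replaces the nested loops with `pd.cut` and then merges on the resulting bucket column, as the question describes. The sample flight data is copied from the answer above. The second dataframe was not shown in the question, so `other` and its `Crew` column are purely hypothetical, as are the `to_hour` helper and the `hour` and `Time Bucket` column names.

```python
import pandas as pd

# Flight data from the answer above.
flights = pd.DataFrame({
    "Flight Name": ["A", "B", "C", "D"],
    "Flight Timing": ["9 AM", "21:00", "3 PM", "15:00"],
})

def to_hour(value):
    """Return the hour of day (0-23) for strings like "9 AM" or "21:00"."""
    value = value.strip()
    if ":" in value:
        return int(value.split(":")[0])
    hour, meridian = value.split(" ")
    hour = int(hour) % 12  # 12 AM and 12 PM both start from 0
    return hour + 12 if meridian.upper() == "PM" else hour

flights["hour"] = flights["Flight Timing"].map(to_hour)

# Bucket edges in 24-hour time, matching the labels used in the answer.
edges = [15, 17, 19, 21, 23]
labels = ["3-5 PM", "5-7 PM", "7-9 PM", "9-11 PM"]

# right=False makes each bucket half-open, [start, end), so 21:00 falls in
# "9-11 PM". Hours outside every bucket become NaN.
flights["Time Bucket"] = pd.cut(
    flights["hour"], bins=edges, labels=labels, right=False
).astype(object)

# Hypothetical second dataframe keyed by the same bucket labels.
other = pd.DataFrame({"Time Bucket": labels, "Crew": ["W", "X", "Y", "Z"]})

# A left merge keeps flights that fall outside every bucket.
merged = flights.merge(other, on="Time Bucket", how="left")
print(merged)
```

Working with the numeric hour avoids string matching on the bucket labels, and the left merge keeps the 9 AM flight in the result with missing values rather than dropping it.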
https://stackoverflow.com/questions/66293519