I have the following script that populates my database with Yahoo Finance data:

from multiprocessing import Pool
import json, time, yfinance
import django
django.setup()
from dividends_info.functions.stock_info import save_stock_info_data
from dividends_info.models import StockInfo

with open('tickers/nyse_tickers.json') as tickers_file:
    TICKERS = json.load(tickers_file)
TICKERS = TICKERS[0:100]

# https://www.digitalocean.com/community/tutorials/python-multiprocessing-example
def update_a_stock(ticker):
    stock, created = StockInfo.objects.get_or_create(ticker=ticker)
    yahoo_stock_obj = yfinance.Ticker(ticker.upper())
    earnings_history = yahoo_stock_obj.earnings_history
    save_stock_info_data(yahoo_stock_obj, ticker, stock, earnings_history)
    if not stock.earnings:
        print(f"No earnings for {ticker}")

def pool_handler():
    start = time.time()
    p = Pool(2)
    p.map(update_a_stock, TICKERS)
    with open("time_taken_to_populate.txt", "w") as time_file:
        time_taken = round((time.time() - start), 2)
        time_file.write(f"Time taken = {time_taken:.10f}")

if __name__ == '__main__':
    pool_handler()

The important part is:

yahoo_stock_obj = yfinance.Ticker(ticker.upper())
earnings_history = yahoo_stock_obj.earnings_history

My save_stock_info_data function prints the type of the earnings history:

ticker in save stock info func: ABEV
<class 'pandas.core.frame.DataFrame'>
ticker in save stock info func: A
<class 'pandas.core.frame.DataFrame'>
Saved new information for stock ABEV
Saved new information for stock A
ticker in save stock info func: ABG
<class 'pandas.core.frame.DataFrame'>
ticker in save stock info func: AA
<class 'pandas.core.frame.DataFrame'>
Saved new information for stock ABG
Saved new information for stock AA
ticker in save stock info func: ABM
<class 'pandas.core.frame.DataFrame'>

As you can see, the script starts off fine, and for the first 13 calls it saves the earnings data:

python3 count_db_items.py
21 many stocks
name: 18, summary: 18, dividends: 16, earnings: 13

After the first 13 earnings are saved, earnings_history is no longer returned by the API:

ticker in save stock info func: AB
<class 'NoneType'>
Could not find data for ACCO.
ticker in save stock info func: ACCO
<class 'NoneType'>
Saved new information for stock AB
No earnings for AB
Saved new information for stock ACCO
No earnings for ACCO
Could not find data for ABB.
ticker in save stock info func: ABB
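<class 'NoneType'>

From that point on, every earnings_history I read from a yfinance yahoo_stock_obj is None. This happens on every run, always after the first 13 have been saved. All other data, including dividends, is still available. I think dividends are sent as an array, while earnings_history is sent as a pandas DataFrame.
When I run the script for all stocks, I end up with 1800 stocks, 1700 with dividends, and only 25 with earnings results.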
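
Since only a small share of tickers end up with earnings, it may help to record which ones came back as None so that a later pass can retry just those instead of the whole list. A minimal sketch, assuming the fetched values are first collected in a dict; `record_missing` and the output path are hypothetical names and not part of the script above:

```python
import json


def record_missing(results, path):
    """Write the tickers whose earnings history is None to a JSON file.

    `results` maps ticker -> earnings history (or None when nothing
    came back). Returns the sorted list of missing tickers.
    """
    missing = sorted(t for t, earnings in results.items() if earnings is None)
    with open(path, "w") as out:
        json.dump(missing, out)
    return missing
```

The resulting file has the same shape as a tickers JSON list, so it could be loaded in place of the full ticker file on the next run.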
Running the script synchronously does not help; in fact it is worse, since no earnings get saved at all:

import json, os, sys, time, yfinance
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "dividends_project.settings")
sys.path.append('../..')
import django
django.setup()
from dividends_info.functions.stock_info import save_stock_info_data
from dividends_info.models import StockInfo

# https://stackoverflow.com/questions/59159991/modulenotfounderror-no-module-named-foo-how-can-i-import-a-model-into-a-djang
# https://pythonspeed.com/articles/python-multiprocessing/
# https://github.com/pytorch/pytorch/issues/3492
f = open('tickers/nyse_tickers.json')
TICKERS = json.load(f)
f.close()
print(len(TICKERS))
TICKERS = TICKERS[:50]

def update_a_stock(ticker):
    stock, created = StockInfo.objects.get_or_create(ticker=ticker)
    yahoo_stock_obj = yfinance.Ticker(ticker.upper())
    earnings_history = yahoo_stock_obj.earnings_history
    # stock, yahoo_obj = save_stock_info_data(ticker, stock)
    save_stock_info_data(yahoo_stock_obj, ticker, stock, earnings_history)
    if not stock.earnings:
        print("earnings didn't save for this stock...the earnings are:")
        print(earnings_history)

def run():
    start = time.time()
    for ticker in TICKERS:
        update_a_stock(ticker)
    with open("time_taken_to_populate.txt", "w") as time_file:
        time_taken = round((time.time() - start), 2)
        time_file.write(f"Time taken = {time_taken:.10f}")

if __name__ == "__main__":
    run()

Without earnings data, my website is pretty much useless. Since the synchronous version of the script does not work either, this does not seem to be a rate-limiting problem.
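
One way to test the rate-limiting hypothesis more directly would be to retry a None result after a growing pause. A minimal sketch, assuming the None results are transient, which nothing above confirms; `fetch_with_retry` and its parameters are hypothetical, and `fetch` stands for any callable that returns the earnings history or None:

```python
import time


def fetch_with_retry(fetch, ticker, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch(ticker) until it returns something other than None.

    Waits base_delay, 2 * base_delay, 4 * base_delay, ... between
    attempts. Returns None if every attempt came back empty.
    """
    for attempt in range(attempts):
        result = fetch(ticker)
        if result is not None:
            return result
        # Only pause if another attempt is still to come.
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return None
```

In the script above it could be called as `fetch_with_retry(lambda t: yfinance.Ticker(t.upper()).earnings_history, ticker)`. If the retries never succeed even with long pauses, that would point away from throttling.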
If I fetch a single stock, it does have earnings:

import sys
ticker = sys.argv[1]

def update_a_stock(ticker):
    stock, created = StockInfo.objects.get_or_create(ticker=ticker)
    yahoo_stock_obj = yfinance.Ticker(ticker.upper())
    earnings_history = yahoo_stock_obj.earnings_history
    earnings = gather_earnings_object

In ipdb:

ipdb> earnings
[{'date': datetime.date(2022, 7, 29), 'expected': 3.31, 'actual': 3.37, 'surprise': '+1.69'}, {'date': datetime.date(2022, 4, 29), 'expected': 3.14, 'actual': 3.16, 'surprise': '+0.57'}, {'date': datetime.date(2022, 2, 2), 'expected': 3.29, 'actual': 3.31, 'surprise': '+0.73'}, {'date': datetime.date(2021, 10, 29), 'expected': 3.22, 'actual': 3.33, 'surprise': '+3.32'}, ....

As it stands, I would have to run the script manually 10,000 times, one API call per run, to populate the data for all US stocks.
How can I run an asynchronous script that calls yfinance to populate my db?
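
For reference, the kind of concurrent runner I have in mind could look like the sketch below. It uses a thread pool rather than asyncio, on the assumption that the yfinance calls are blocking network requests; `update_many` is a hypothetical helper, and `update_one` stands for a function such as `update_a_stock`. It only organizes the calls and does not explain why earnings_history comes back as None:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def update_many(tickers, update_one, max_workers=4):
    """Run update_one(ticker) for every ticker on a small thread pool.

    Returns (succeeded, failed): a list of tickers that finished and a
    dict mapping each failed ticker to the exception it raised.
    """
    succeeded, failed = [], {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(update_one, t): t for t in tickers}
        for future in as_completed(futures):
            ticker = futures[future]
            try:
                future.result()  # re-raises anything update_one raised
            except Exception as exc:
                failed[ticker] = exc
            else:
                succeeded.append(ticker)
    return succeeded, failed
```

Collecting failures per ticker means one bad symbol does not stop the whole run, and the failed tickers can be fed back in afterwards.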
Posted on 2022-11-20 13:11:26
Here you go.

import pandas_datareader as web
import pandas as pd

df = web.DataReader('AAPL', data_source='yahoo', start='2011-01-01', end='2021-01-12')
df.head()

import yfinance as yf

aapl = yf.Ticker("AAPL")
aapl

# show earnings
aapl.earnings
aapl.quarterly_earnings

Result:

              Revenue     Earnings
Quarter
4Q2021   123945000000  34630000000
1Q2022    97278000000  25010000000
2Q2022    82959000000  19442000000
3Q2022    90146000000  20721000000

pandas-datareader has all kinds of things available to use.
https://pandas-datareader.readthedocs.io/en/latest/readers/yahoo.html
https://stackoverflow.com/questions/73724165