Automating the date range for data extraction

Stack Overflow user
Asked 2017-02-27 05:50:55
Answers: 1, Views: 467, Followers: 0, Votes: 1

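I use the script below to extract data from Google Analytics. Here I am extracting the last week's data. I want to automate the date range so that I don't have to change date_ranges every week. I also want to avoid data sampling by GA. Please guide me, in detail, on the right way to automate this.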

__author__ = 'test@gmail.com (test)'

import argparse
import sys
import csv
import string
import datetime
import json
import time

from apiclient.errors import HttpError
from apiclient import sample_tools
from oauth2client.client import AccessTokenRefreshError

cam_name = sys.argv[1:]

class SampledDataError(Exception): pass

def main(argv):
  # Authenticate and construct service.
  service, flags = sample_tools.init(
      argv[0], 'analytics', 'v3', __doc__, __file__,
      scope='https://www.googleapis.com/auth/analytics.readonly')

  # Try to make a request to the API. Print the results or handle errors.
  try:
    profile_id = profile_ids[profile]
    if not profile_id:
      print ('Could not find a valid profile for this user.')
    else:      
      metrics = argv[1]
      dimensions = argv[2]
      reportName = argv[3]
      sort = argv[4]
      filters = argv[5]

      for start_date, end_date in date_ranges:
        limit = ga_query(service, profile_id, 0,
                                 start_date, end_date, metrics, dimensions, sort, filters).get('totalResults')
        for pag_index in range(0, limit, 10000):
          results = ga_query(service, profile_id, pag_index,
                                     start_date, end_date, metrics, dimensions, sort, filters)
          # if results.get('containsSampledData'):

            # raise SampledDataError
          print_results(results, pag_index, start_date, end_date, reportName)

  except TypeError as error:    
    # Handle errors in constructing a query.
    print ('There was an error in constructing your query : %s' % error)

  except HttpError as error:
    # Handle API errors.
    print ('Arg, there was an API error : %s : %s' %
           (error.resp.status, error._get_reason()))

  except AccessTokenRefreshError:
    # Handle Auth errors.
    print ('The credentials have been revoked or expired, please re-run '
           'the application to re-authorize')

  except SampledDataError:
    # force an error if ever a query returns data that is sampled!
    print ('Error: Query contains sampled data!')


def ga_query(service, profile_id, pag_index, start_date, end_date, metrics, dimensions, sort, filters):

   return service.data().ga().get(
      ids='ga:' + profile_id,
      start_date=start_date,
      end_date=end_date,
      metrics=metrics,
      dimensions=dimensions,
      sort=sort,
      filters=filters,
      samplingLevel='HIGHER_PRECISION',
      start_index=str(pag_index+1),
      max_results='10000').execute()  # page size, not an end index; the API caps max-results at 10,000


def print_results(results, pag_index, start_date, end_date, reportName):
  """Prints out the results.

  This prints out the profile name, the column headers, and all the rows of
  data.

  Args:
    results: The response returned from the Core Reporting API.
  """

  # New write header
  if pag_index == 0:
    if (start_date, end_date) == date_ranges[0]:
      print  ('Profile Name: %s' % results.get('profileInfo').get('profileName'))
      columnHeaders = results.get('columnHeaders')
      cleanHeaders = [str(h['name']) for h in columnHeaders]
      writer.writerow(cleanHeaders)
    print (reportName,'Now pulling data from %s to %s.' %(start_date, end_date))


  # Print data table.
  if results.get('rows', []):
    for row in results.get('rows'):
      for i in range(len(row)):
        old, new = row[i], str()
        for s in old:
          new += s if s in string.printable else ''
        row[i] = new
      writer.writerow(row)

  else:
    print ('No Rows Found')

  limit = results.get('totalResults')
  print (pag_index, 'of about', int(round(limit, -4)), 'rows.')
  return None

# Uncomment this line & replace with 'profile name': 'id' to query a single profile
# Delete or comment out this line to loop over multiple profiles.

#Brands

profile_ids = {'abc-Mobile': '12345',
                'abc-Desktop': '23456',
                'pqr-Mobile': '34567',
                'pqr-Desktop': '45678',
                'xyz-Mobile': '56789',
                'xyz-Desktop': '67890'}

date_ranges = [
('2017-01-24','2017-01-24'),
('2017-01-25','2017-01-25'),
('2017-01-26','2017-01-26'),
('2017-01-27','2017-01-27'),
('2017-01-28','2017-01-28'),
('2017-01-29','2017-01-29'),
('2017-01-30','2017-01-30')
]

for profile in sorted(profile_ids):
  print("Sequence 1",profile)
  with open('qwerty.json') as json_data:
    d = json.load(json_data)
    for getThisReport in d["Reports"]:
      print("Sequence 2",getThisReport["ReportName"])
      reportName = getThisReport["ReportName"]
      metrics = getThisReport["Metrics"]
      dimensions = getThisReport["Dimensions"]
      sort = getThisReport["sort"]
      filters = getThisReport["filter"]

      path = 'C:\\Projects\\DataExport\\test\\' #replace with path to your folder where csv file with data will be written

      today = time.strftime('%Y%m%d')

      filename = profile+'_'+reportName+'_'+today+'.csv' #replace with your filename. Note %s is a placeholder variable and the profile name you specified on row 162 will be written here
      with open(path + filename, 'wt') as f:
        writer = csv.writer(f,delimiter = '|', lineterminator='\n', quoting=csv.QUOTE_MINIMAL)
        args = [sys.argv,metrics,dimensions,reportName,sort,filters]
        if __name__ == '__main__': main(args)
      print ( "Profile done. Next profile...")

print ("All profiles done.")

1 Answer

Stack Overflow user

Answered 2017-02-27 08:04:48

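As far as dates go, the Core Reporting API supports some interesting things.

All Analytics data requests must specify a date range. If you do not include start-date and end-date parameters in the request, the server returns an error. Date values can be for a specific date by using the pattern YYYY-MM-DD, or relative by using the today, yesterday, or NdaysAgo pattern. Values must match [0-9]{4}-[0-9]{2}-[0-9]{2}|today|yesterday|[0-9]+(daysAgo).

So do something like this: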

start_date = '7daysAgo' 
end_date   = 'today'
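
To check that a value has the shape the API expects before sending a request, the pattern quoted above can be compiled as a regular expression. This is only a minimal sketch: the helper name is_valid_date_value is made up for illustration and is not part of any Google library, and the check covers the format only, not whether the date actually exists.

```python
import re

# Pattern as quoted from the Core Reporting API documentation above.
DATE_VALUE = re.compile(
    r'[0-9]{4}-[0-9]{2}-[0-9]{2}|today|yesterday|[0-9]+(daysAgo)')


def is_valid_date_value(value):
    """Hypothetical helper: True if the whole string matches the pattern.

    This only checks the shape of the value; '2017-13-45' would pass
    even though it is not a real calendar date.
    """
    return DATE_VALUE.fullmatch(value) is not None


for candidate in ('7daysAgo', 'today', '2017-01-24', 'last week'):
    print(candidate, is_valid_date_value(candidate))
```

Both '7daysAgo' and 'today' pass this check, so they can be handed to the script's ga_query function unchanged in place of the hard-coded dates.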

Keep in mind that data takes 24 to 48 hours to finish processing, so your data for today, yesterday, and the day before may not be 100% accurate.
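
If you would rather keep the one-day-per-query structure of the date_ranges list in the question, the list can be generated with the standard datetime module instead of being edited by hand. The sketch below is one possible approach, not the only one: the function name daily_date_ranges and its parameters days, lag_days, and today are assumptions made for illustration, and the default lag_days=3 is simply chosen to skip today, yesterday, and the day before, in line with the processing delay mentioned above.

```python
import datetime


def daily_date_ranges(days=7, lag_days=3, today=None):
    """Hypothetical helper: build one (start, end) pair per day, oldest first.

    The window covers `days` consecutive days and ends `lag_days` days
    before `today`, so days that may still be processing are left out.
    """
    if today is None:
        today = datetime.date.today()
    last_day = today - datetime.timedelta(days=lag_days)
    ranges = []
    for offset in range(days - 1, -1, -1):
        # isoformat() yields YYYY-MM-DD, the absolute format the API accepts.
        day = (last_day - datetime.timedelta(days=offset)).isoformat()
        ranges.append((day, day))
    return ranges


# Drop-in replacement for the hard-coded list in the question.
date_ranges = daily_date_ranges()
```

Called as daily_date_ranges(days=7, lag_days=1, today=datetime.date(2017, 1, 31)), this returns the same seven tuples that are hard-coded in the question, from ('2017-01-24', '2017-01-24') to ('2017-01-30', '2017-01-30').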

Votes: 1
Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/42478655
