How to save month-wise mail counts, split mails by time of day using the Gmail API, and save the output to CSV or convert it into a DataFrame?

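The code below gets the message count for each interval of roughly 30 days, starting from the period in which the first message was sent.

In detail, this code lets us:

1. Find the first mail Amazon sent to my inbox that matches a specific phrase (here, the first order).

2. Convert the epoch value to a date-time, then step through the dates (the code uses relativedelta, so each interval is one calendar month rather than exactly 30 days) and count the mails sent in each interval.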
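
Step 2 in isolation, as a minimal sketch: Gmail's `internalDate` is a string of epoch milliseconds, so it has to be divided by 1000 before it is turned into a date. The sample value is the one shown in the output below; using UTC here is an assumption made so the result does not depend on the machine's time zone.

```python
from datetime import datetime, timezone

# internalDate as returned by the Gmail API: epoch milliseconds, as a string
internal_date = "1534476682000"

# Convert milliseconds to seconds, then to a timezone-aware datetime (UTC assumed)
first_order = datetime.fromtimestamp(int(internal_date) / 1000, tz=timezone.utc)

# First day of that month, used as the start of the first counting interval
month_start = first_order.replace(day=1, hour=0, minute=0, second=0, microsecond=0)

print(first_order.isoformat())  # 2018-08-17T03:31:22+00:00
print(month_start.date())       # 2018-08-01
```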

The output of the full script looks like this:

Amazon first order:

1534476682000

Amazon total orders between 2018-08-01 and 2018-09-01: 20

Amazon total orders between 2018-09-01 and 2018-10-01: 11

Amazon total orders between 2018-10-01 and 2018-11-01: 15

Amazon total orders between 2018-11-01 and 2018-12-01: 7

Amazon total orders between 2018-12-01 and 2019-01-01: 19

Amazon total orders between 2019-01-01 and 2019-02-01: 23

Amazon total orders between 2019-02-01 and 2019-03-01: 12

Code:

#amazonfirstorder
from googleapiclient.discovery import build
from httplib2 import Http
from oauth2client import file, client, tools
from dateutil.relativedelta import relativedelta
from datetime import datetime


SCOPES = 'https://www.googleapis.com/auth/gmail.readonly'

def main():
    # Authorize with the stored token, running the OAuth flow if needed
    store = file.Storage('token.json')
    creds = store.get()
    if not creds or creds.invalid:
        flow = client.flow_from_clientsecrets('credentials.json', SCOPES)
        creds = tools.run_flow(flow, store)
    service = build('gmail', 'v1', http=creds.authorize(Http()))

    results = service.users().messages().list(userId='me', q='from:auto-confirm@amazon.in subject:(your amazon.in order of )', labelIds=['INBOX']).execute()
    messages = results.get('messages', [])

    log_results = []
    print('\nAmazon first order:')
    if not messages:
        print(" ")
        return log_results

    # Assumes the list is newest first and fits on one page,
    # so the last entry is the oldest matching mail
    msg = service.users().messages().get(userId='me', id=messages[-1]['id']).execute()
    print(msg['internalDate'])

    # internalDate is epoch milliseconds; convert to seconds
    ts = int(msg['internalDate']) / 1000
    # Local time, to match the .timestamp() calls on naive datetimes below
    first_order = datetime.fromtimestamp(ts)

    start_date = datetime(first_order.year, first_order.month, 1)
    #start_date = datetime(2016,1,1)
    end_date = datetime.today()
    increment = relativedelta(months=1)
    target_date = start_date + increment

    while target_date <= end_date:

        timestamp_after = int(start_date.timestamp())  # timestamp of start day
        timestamp_before = int(target_date.timestamp())  # timestamp of start day + 1 month

        query = f'from:(auto-confirm@amazon.in) subject:(your amazon.in order of ) after:{timestamp_after} before:{timestamp_before}'
        results = service.users().messages().list(userId='me', q=query, labelIds=['INBOX']).execute()

        messages = results.get('messages', [])
        orders = len(messages)
        start_date_str = start_date.strftime('%Y-%m-%d')
        target_date_str = target_date.strftime('%Y-%m-%d')
        print(f"\nAmazon total orders between {start_date_str} and {target_date_str}: {orders}")

        log_results.append(dict(start=start_date_str, end=target_date_str, orders=orders))

        # update interval
        start_date += increment
        target_date += increment

    return log_results



if __name__ == '__main__':
    log_results = main()

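Now I have two questions:

First:

How do I save the output of this code to a CSV file?

Second:

The code above gives the mail count for each 30-day period. What I need is the count of mails received before 12:00 PM and the count received after 12:00 PM, each broken down by month, saved to a CSV.

The output I need for the second question: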

Amazon total orders between 2018-09-01 and 2018-10-01 before 12:00 PM : 11

Amazon total orders between 2018-10-01 and 2018-11-01 before 12:00 PM : 15

Amazon total orders between 2018-11-01 and 2018-12-01 before 12:00 PM : 7

Amazon total orders between 2018-12-01 and 2019-01-01 before 12:00 PM : 19

Amazon total orders between 2018-09-01 and 2018-10-01 after 12:00 PM : 3

Amazon total orders between 2018-10-01 and 2018-11-01 after 12:00 PM : 6

Amazon total orders between 2018-11-01 and 2018-12-01 after 12:00 PM : 88

Amazon total orders between 2018-12-01 and 2019-01-01 after 12:00 PM : 26

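You just need to iterate over the dates at the interval you want.

The code below retrieves the user's messages within a specific period, for example the number of messages in a given month.

What you need help with is automating the retrieval of the mail count for every 30 days.

For example, this code gets the messages from 1 January 2016 to 30 January 2016.

So you need to automate it to run at regular 30-day intervals from 1 January 2016 to 1 January 2019.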

from googleapiclient.discovery import build
from httplib2 import Http
from oauth2client import file, client, tools
from dateutil.relativedelta import relativedelta
from datetime import datetime

SCOPES = 'https://www.googleapis.com/auth/gmail.readonly'

def main():
    store = file.Storage('token.json')
    creds = store.get()
    if not creds or creds.invalid:
        flow = client.flow_from_clientsecrets('credentials.json', SCOPES)
        creds = tools.run_flow(flow, store)
    service = build('gmail', 'v1', http=creds.authorize(Http()))
    end_date = datetime(2019, 1, 1)
    interval = relativedelta(months=1)
    current = datetime(2016, 1, 1)              # init to the start date
    # Stop once the window would start at or after end_date
    while current < end_date:
        # Pass whole epoch seconds to after:/before: rather than floats
        after = int(current.timestamp())
        before = int((current + interval).timestamp())

        query = 'from:(auto-confirm@amazon.in) subject:(your amazon.in order of ) after:{} before:{}'.format(after, before)
        results = service.users().messages().list(userId='me', q=query, labelIds=['INBOX']).execute()

        messages = results.get('messages', [])
        print("\namazon total orders in {}: {}".format(current.strftime('%B %Y'), len(messages)))
        current += interval

if __name__ == '__main__':
    main()
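
One caveat about the counts above: `messages().list` returns results one page at a time, so `len(messages)` only counts the first page and a busy month can be undercounted. A minimal sketch of following `nextPageToken` until it runs out is shown here. `list_all_message_ids` is a hypothetical helper name, and `service` is assumed to be the object returned by `build('gmail', 'v1', ...)` as in the code above.

```python
def list_all_message_ids(service, query, label_ids=("INBOX",)):
    """Collect the id of every message matching `query`, across all pages.

    `service` is assumed to be a Gmail API client built with
    googleapiclient.discovery.build('gmail', 'v1', ...).
    """
    ids = []
    page_token = None
    while True:
        params = {"userId": "me", "q": query, "labelIds": list(label_ids)}
        if page_token:
            # Only send pageToken once the previous response supplied one
            params["pageToken"] = page_token
        response = service.users().messages().list(**params).execute()

        ids.extend(message["id"] for message in response.get("messages", []))

        # No nextPageToken in the response means this was the last page
        page_token = response.get("nextPageToken")
        if not page_token:
            return ids
```

With this in place, `len(list_all_message_ids(service, query))` could stand in for `len(messages)` inside the loop, and the last id in the list for an unrestricted query would be a better candidate for the first order than `messages[-1]` from a single page.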

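Similar to what has already been proposed, but in this case the increment is exactly one calendar month rather than 30 days (note the use of relativedelta instead of timedelta):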

from googleapiclient.discovery import build
from httplib2 import Http
from oauth2client import file, client, tools
from dateutil.relativedelta import relativedelta
from datetime import datetime
import pandas as pd

SCOPES = 'https://www.googleapis.com/auth/gmail.readonly'

def main():

    store = file.Storage('token.json')
    creds = store.get()
    if not creds or creds.invalid:
        flow = client.flow_from_clientsecrets('credentials.json', SCOPES)
        creds = tools.run_flow(flow, store)
    service = build('gmail', 'v1', http=creds.authorize(Http()))

    log_results = []

    start_date = datetime(2016, 1, 1)
    end_date = datetime.today()
    increment = relativedelta(months=1)
    target_date = start_date + increment

    # Only complete months are counted; the current partial month is skipped
    while target_date <= end_date:

        timestamp_after = int(start_date.timestamp())  # timestamp of start day
        timestamp_before = int(target_date.timestamp())  # timestamp of start day + 1 month

        query = f'from:(auto-confirm@amazon.in) subject:(your amazon.in order of ) after:{timestamp_after} before:{timestamp_before}'
        results = service.users().messages().list(userId='me', q=query, labelIds=['INBOX']).execute()

        messages = results.get('messages', [])
        orders = len(messages)
        start_date_str = start_date.strftime('%Y-%m-%d')
        target_date_str = target_date.strftime('%Y-%m-%d')
        print(f"\nAmazon total orders between {start_date_str} and {target_date_str}: {orders}")

        log_results.append(dict(start=start_date_str, end=target_date_str, orders=orders))

        # update interval
        start_date += increment
        target_date += increment

    return log_results



if __name__ == '__main__':
    log_results = main()
    # Write to csv; index=False leaves out the row-number column
    df = pd.DataFrame(log_results)
    df.to_csv('orders.csv', index=False)
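
For the second question (counts before and after 12:00 PM per month), a search query alone is not enough, because the time of day has to be read from each message. One possible approach, sketched under assumptions: collect the `internalDate` of every matching message (via `messages().get`, as the question's code does for the first order), then bucket the values by month and by hour. `count_by_month_and_noon` and `write_rows` are hypothetical helper names, and treating "12:00 PM" as noon UTC is an assumption; pass a different `tz` if local time is wanted.

```python
import csv
import sys
from collections import Counter
from datetime import datetime, timezone

FIELDNAMES = ["month", "before_noon", "after_noon"]


def count_by_month_and_noon(internal_dates_ms, tz=timezone.utc):
    """Count messages per month, split into before and after 12:00 PM.

    internal_dates_ms: Gmail `internalDate` values (epoch milliseconds,
    given as str or int).
    tz: time zone that decides where noon falls (UTC is an assumption).
    """
    counts = Counter()
    for raw in internal_dates_ms:
        received = datetime.fromtimestamp(int(raw) / 1000, tz=tz)
        # 12:00:00 itself is counted as "after noon"
        half = "before_noon" if received.hour < 12 else "after_noon"
        counts[(received.strftime("%Y-%m"), half)] += 1

    months = sorted({month for month, _ in counts})
    return [
        {
            "month": month,
            "before_noon": counts[(month, "before_noon")],
            "after_noon": counts[(month, "after_noon")],
        }
        for month in months
    ]


def write_rows(rows, handle):
    """Write the rows as CSV to an already open text file handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


if __name__ == "__main__":
    # The first-order value from the question, used as sample input
    sample = ["1534476682000"]
    write_rows(count_by_month_and_noon(sample), sys.stdout)
```

To save to a file instead of printing, open it with `open('orders_by_noon.csv', 'w', newline='')` and pass the handle to `write_rows`. The list returned by `count_by_month_and_noon` has the same list-of-dicts shape as `log_results`, so it can also be passed to `pd.DataFrame(...)` as in the answer above.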