
Automating Google Trend Analysis

  • tcanengin
  • Jan 24, 2025
  • 5 min read

Updated: Jan 29, 2025

Automating Trend Analysis with Python, Linux, Ansible, and Azure DevOps


In today’s fast-paced digital landscape, staying ahead of trends is crucial for businesses, creators, and analysts. In this blog, I’ll walk you through a project I built to automate the process of collecting trending topics from Google Trends. The system leverages a combination of powerful tools and technologies to ensure efficiency, scalability, and reliability.


The purpose of this project is to create a fully automated system for collecting and analyzing trending topics from Google Trends, enabling real-time insights into popular search trends. By combining Python for data collection, Linux cron jobs for scheduling, Ansible for deployment automation, and Azure DevOps pipelines for continuous integration and delivery, this project streamlines the entire workflow, from script execution to deployment.


Project Overview:

  • Created an automation process to fetch Google Trends data for specific keywords and locations.

  • The script fetches trending topics hourly and logs the data, with plans to store it for analysis or further processing.

  • Used Python, Ansible, Azure DevOps Pipelines, and Linux to automate and deploy the solution.




Python Script:

  • Developed a Python script (google_trends.py) that:

    • Fetches trending topics from Google Trends via the pytrends library.

    • Filters the data for specific regions (like the Netherlands) and topics (e.g., "Djokovic," "Ajax").

    • Logs the trending topics to a file for future reference or processing.

  • The script is designed to run at regular intervals, fetching the latest trending data.

google_trends.py

import warnings
import time
import logging

from pytrends.exceptions import TooManyRequestsError
from pytrends.request import TrendReq
from flask import Flask, render_template
from apscheduler.schedulers.background import BackgroundScheduler

# Suppress the specific FutureWarning from pytrends
warnings.simplefilter(action='ignore', category=FutureWarning)

# Initialize pytrends
pytrends = TrendReq(hl='en-US', tz=360)

# Initialize Flask app
app = Flask(__name__)

# Global variable to store trends data
trends_data = []

# Set up logging
logging.basicConfig(level=logging.DEBUG, format='%(asctime)s - %(message)s')

# Get Google Trends data for the Netherlands, with bounded retries and delays
# (a retry loop rather than recursion, so repeated failures cannot overflow the stack)
def get_trends(max_retries=3):
    global trends_data
    for attempt in range(max_retries):
        try:
            logging.info("Fetching Google Trends data for Netherlands...")

            # Fetch trending data from Google Trends for the Netherlands
            trending_data = pytrends.trending_searches(pn='netherlands')

            # Log the raw response to confirm we received the expected data
            logging.info(f"Raw Trends Data (head): {trending_data.head()}")  # First few entries
            logging.info(f"Raw Trends Data (full): {trending_data}")  # Entire frame

            if trending_data.empty:
                logging.warning("No data returned from Google Trends.")
            else:
                # Keep the top 10 trending topics
                top_trends = trending_data.head(10).values.flatten().tolist()
                trends_data = [{'trend': trend} for trend in top_trends]  # Update global trends_data
                logging.info(f"Fetched {len(trends_data)} trends for Netherlands.")
            return
        except TooManyRequestsError:
            logging.error("Too many requests to Google Trends. Retrying after a delay.")
            time.sleep(60)
        except Exception as e:
            logging.error(f"An error occurred while fetching trends: {e}")
            time.sleep(10)
    logging.error(f"Giving up after {max_retries} failed attempts.")

# Route to display trends on a webpage
@app.route('/')
def home():
    # Log the trends data to check that it is being populated
    logging.info(f"Trends Data: {trends_data}")
    # Pass the trends data to the HTML template
    return render_template('index.html', trends=trends_data)

# Schedule the job (fetch trends every hour)
def schedule_job():
    scheduler = BackgroundScheduler()
    scheduler.add_job(get_trends, 'interval', hours=1)  # Run every hour
    scheduler.start()

if __name__ == "__main__":
    # Fetch trends initially when the app starts
    get_trends()

    # Schedule the task to fetch trends every hour
    schedule_job()

    # Start the Flask app
    app.run(debug=True, host='0.0.0.0', port=5003)
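
The home route renders a template that isn't reproduced in the post. A minimal templates/index.html, assuming the page only lists the trend names, could look like this:

templates/index.html

<!DOCTYPE html>
<html>
  <head>
    <title>Google Trends - Netherlands</title>
  </head>
  <body>
    <h1>Trending in the Netherlands</h1>
    <ul>
      {% for item in trends %}
        <li>{{ item.trend }}</li>
      {% endfor %}
    </ul>
  </body>
</html>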

Automation with Cron:

  • To make the script run hourly, we:

    • Used cron (the Linux task scheduler) to trigger the Python script automatically by adding a job to the system crontab.

    • The cron job runs the script every hour without manual intervention; the installed entry can be listed with crontab -l, and a representative entry is sketched below.
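
The exact entry isn't reproduced here, so this is a minimal sketch, assuming the project paths from the playbook vars shown later and that the mailer script below is the one being scheduled:

crontab entry

# Run the trends mailer at the top of every hour
0 * * * * /home/tengin/taner_project_env/googletrends-project/venv/bin/python /home/tengin/taner_project_env/googletrends-project/google_trends_mail.py >> /home/tengin/taner_project_env/googletrends-project/cron.log 2>&1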

google_trends_mail.py

import smtplib
import json
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart

import requests


def send_email(subject, body, to_email):
    from_email = "your_email@example.com"
    from_password = "your_email_password"  # Use an App Password if using Gmail with 2FA enabled
    smtp_server = "smtp.gmail.com"
    smtp_port = 587

    # Set up the MIME message
    msg = MIMEMultipart()
    msg['From'] = from_email
    msg['To'] = to_email
    msg['Subject'] = subject
    msg.attach(MIMEText(body, 'plain'))

    # Establish a connection to the server and send the message
    try:
        server = smtplib.SMTP(smtp_server, smtp_port)
        server.starttls()
        server.login(from_email, from_password)
        server.sendmail(from_email, to_email, msg.as_string())
        server.quit()
        print("Email sent successfully")
    except Exception as e:
        print(f"Error sending email: {e}")


def get_trending_topics(geo='NL'):
    url = 'https://trends.google.com/trends/api/dailytrends'
    params = {
        'hl': 'en-US',
        'tz': '360',
        'geo': geo,
    }

    # Fetch the trending topics
    response = requests.get(url, params=params)
    if response.status_code == 200:
        try:
            # The endpoint prefixes its JSON payload with ")]}'," as an
            # anti-hijacking guard, so strip that line before parsing.
            payload = response.text
            if payload.startswith(")]}'"):
                payload = payload.split('\n', 1)[1]
            trends_data = json.loads(payload)

            trending_topics = []
            for trend in trends_data['default']['trendingSearchesDays']:
                for topic in trend['trendingSearches']:
                    trending_topics.append(topic['title']['query'])

            return trending_topics
        except Exception as e:
            print(f"Error parsing the trends data: {e}")
            return []
    else:
        print(f"Failed to retrieve data: {response.status_code}")
        return []


if __name__ == "__main__":
    # Fetch the trending topics
    trending_topics = get_trending_topics()

    # If there are trending topics, send an email
    if trending_topics:
        subject = "Google Trending Topics"
        body = "Here are the trending topics:\n\n" + "\n".join(trending_topics)

        # Send the email notification
        send_email(subject, body, "recipient_email@example.com")
    else:
        print("No trending topics found.")

Linux Setup:

  • Configured the Linux environment to:

    • Install the required dependencies (Python plus the libraries in requirements.txt; a sketch of that file follows this list).

    • Set up and manage the cron job for hourly execution of the script.

    • Handle potential execution issues and capture output logs.
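
The file's contents aren't listed in the post, but judging from the imports in the two scripts and the gunicorn entry point in the playbook, requirements.txt would contain roughly the following (left unpinned here; the exact packages and versions are an assumption):

requirements.txt

pytrends
flask
apscheduler
requests
gunicorn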


Ansible Automation:

  • To automate the deployment and setup process on the server (or across multiple servers), Ansible was used:

    • Ansible Playbook: an Ansible playbook was written to automate installation and configuration of the Python environment, its dependencies, and the systemd service that runs the app.

    • The playbook is idempotent (running it multiple times causes no issues), which streamlines deployment across servers.

    playbook.yml

---
- name: Deploy Google Trends App
  hosts: all
  become: true
  vars:
    app_name: google_trends
    app_dir: /home/tengin/taner_project_env/googletrends-project
    venv_dir: "{{ app_dir }}/venv"
    requirements_file: "{{ app_dir }}/requirements.txt"
    service_file: "/etc/systemd/system/{{ app_name }}.service"
    gunicorn_exec: "{{ venv_dir }}/bin/gunicorn"
    app_module: "google_trends:app"
    user: "{{ lookup('env', 'USER') }}"  # Retrieve the current user from the environment

  tasks:
    - name: Ensure Python is installed
      apt:
        name: python3
        state: present
        update_cache: yes

    - name: Ensure pip is installed
      apt:
        name: python3-pip
        state: present
        update_cache: yes

    - name: Create a virtual environment
      command: python3 -m venv "{{ venv_dir }}"
      args:
        creates: "{{ venv_dir }}/bin/activate"

    - name: Install requirements
      command: "{{ venv_dir }}/bin/pip install -r {{ requirements_file }}"
      args:
        chdir: "{{ app_dir }}"

    - name: Create the systemd service file
      copy:
        dest: "{{ service_file }}"
        content: |
          [Unit]
          Description=Google Trends Flask Application
          After=network.target

          [Service]
          User={{ user }}
          WorkingDirectory={{ app_dir }}
          Environment="PATH={{ venv_dir }}/bin"
          ExecStart={{ gunicorn_exec }} -w 4 -b 0.0.0.0:8000 {{ app_module }}
          Restart=always

          [Install]
          WantedBy=multi-user.target
      notify:
        - Reload systemd
        - Restart app service

    - name: Enable the application service
      systemd:
        name: "{{ app_name }}"
        enabled: yes
        state: started

  handlers:
    - name: Reload systemd
      command: systemctl daemon-reload

    - name: Restart app service
      systemd:
        name: "{{ app_name }}"
        state: restarted

Azure DevOps Pipeline:

  • Azure DevOps Pipeline was utilized for continuous integration/continuous deployment (CI/CD):

    • Pipeline Configuration: a pipeline was set up to automate testing, building, and deploying the Python script to the Linux server; a minimal sketch follows this list.

    • This process helps with version control and ensures that any changes or updates to the Python script are automatically deployed to the server.

    • The pipeline can trigger the deployment when code changes are pushed to the repository, ensuring that the latest version of the script is always running.
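
The pipeline definition itself isn't included in the post, so the following is a minimal sketch of what it could look like; the trigger branch, agent image, inventory path, and step layout are assumptions rather than the project's actual configuration:

azure-pipelines.yml

trigger:
  branches:
    include:
      - main

pool:
  vmImage: ubuntu-latest

steps:
  - task: UsePythonVersion@0
    inputs:
      versionSpec: '3.x'

  # Install dependencies and syntax-check both scripts as a lightweight test step
  - script: |
      pip install -r requirements.txt
      python -m py_compile google_trends.py google_trends_mail.py
    displayName: Install dependencies and check scripts

  # Push the current code to the server and (re)configure it with the playbook above
  - script: |
      pip install ansible
      ansible-playbook -i inventory playbook.yml
    displayName: Deploy with Ansible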



End Result:

  • The project now delivers a fully automated system that:

    • Runs every hour to fetch Google Trends data.

    • Logs the trending topics for analysis or reporting.

    • Is fully automated via cron, Ansible, and Azure DevOps pipelines.

  • This system is easy to scale and maintain due to automation at every level (cron for scheduling, Ansible for deployment, and Azure DevOps for CI/CD).


In Summary:

This project involved setting up an end-to-end automation pipeline where we:

  • Wrote a Python script to collect Google Trends data.

  • Automated the execution of the script hourly using cron.

  • Automated the deployment and setup of the system using Ansible.

  • Used Azure DevOps pipelines for continuous delivery, so the latest version of the script is always running on the server.

This approach provides flexibility and scalability to automate the collection and processing of trending topics data, with robust deployment and maintenance procedures in place.

