
Muhammad Manamil on November 13, 2025

12 Python Scripts to Automate Tasks & Save Hours Daily


Hey everyone! As a developer, I spend my life seeking efficiency. We all know Python is the Swiss Army knife of scripting, but sometimes the "12 scripts to rename your downloads" lists just don't cut it. We need real power. We need automations that tackle the tedious, the complex, and the truly time-consuming tasks that crop up daily, especially in a professional setting.

I've put together a list of 12 highly practical, copy-paste-ready Python scripts that I actually use to shave hours off my workflow every single week. These aren't just file organizers; these are scripts that dive into system resources, handle asynchronous data, and manipulate files at a deep level.

Ready to level up your productivity? Let's dive in. 🚀

1. The Clipboard Rule Engine: Your Text Transformation Butler

Problem Solved

How often do you copy a URL only to realize you need to strip the tracking parameters? Or maybe you copy a Python function name and need it in snake_case for your documentation? Manual reformatting is a massive time sink.

Solution Overview

This script runs in the background, monitors your clipboard, and applies a set of user-defined Regular Expression (RegEx) rules. If a rule matches the copied text, it automatically replaces it with the transformed output. Think of it as IFTTT for your clipboard.

The Code

import pyperclip
import re
import time

RULES = [
    # Each rule: (compiled pattern, replacement, lowercase_result_if_matched)

    # 1. Clean Google/Affiliate tracking parameters from URLs
    (re.compile(r'(\?|&)(utm_[^=&]*|aff_[^=&]*|ref)=[^&]*'), '', False),

    # 2. Convert PascalCase/camelCase to snake_case
    (re.compile(r'(.)([A-Z][a-z]+)'), r'\1_\2', True),
    (re.compile(r'([a-z0-9])([A-Z])'), r'\1_\2', True)
]

def apply_rules(text):
    lowercase = False
    for pattern, replacement, lowercase_if_matched in RULES:
        new_text = pattern.sub(replacement, text)
        if lowercase_if_matched and new_text != text:
            lowercase = True
        text = new_text
    # Only lowercase when a case-conversion rule actually fired,
    # so cleaned URLs keep their original casing
    return text.lower() if lowercase else text

if __name__ == "__main__":
    recent_value = pyperclip.paste()
    print("Clipboard Engine running... Press Ctrl+C to stop.")
    
    while True:
        try:
            current_value = pyperclip.paste()
            if current_value != recent_value:
                transformed_value = apply_rules(current_value)
                
                if transformed_value != current_value:
                    print(f"Clipboard updated. Original: '{current_value[:30]}...' -> New: '{transformed_value[:30]}...'")
                    pyperclip.copy(transformed_value)
                
                recent_value = pyperclip.paste() # Update with the potentially new value
            time.sleep(0.5)
        except KeyboardInterrupt:
            break

Code Explanation & Use Cases

This script uses the awesome pyperclip library (you might need pip install pyperclip) to access the system clipboard. It runs an infinite loop, checking the clipboard every 0.5 seconds.

The RULES list holds triples of (pattern, replacement, lowercase_flag); the lowercase step only runs when a case-conversion rule actually matched, so cleaned URLs keep their original casing. The magic happens when we find a new clipboard value: we apply the transformations, and if the output differs, we write it back to the clipboard. I use the URL cleaning rule daily when sharing links—it keeps URLs short and clean!
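
If you want to sanity-check your rules before leaving the watcher running, you can import apply_rules and feed it a few samples. A quick sketch, assuming you saved the script above as clipboard_rules.py (the module name and the inputs are just examples):

from clipboard_rules import apply_rules  # hypothetical filename for the script above

print(apply_rules("https://example.com/article?utm_source=newsletter&utm_medium=email"))
# -> https://example.com/article

print(apply_rules("MyHelperFunction"))
# -> my_helper_function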

2. Zero-Copy File Transfer via mmap

Problem Solved

Copying a very large file (e.g., a 10GB dataset or log file) using standard Python I/O (.read()/.write()) is slow because the OS has to copy the data twice: first from disk to OS kernel buffer, and then from the kernel buffer to the Python user-space buffer.

Solution Overview

The mmap module allows us to map files directly into the process's memory space. By mapping both the source and the destination, the copy moves data between the kernel's page caches without building a huge intermediate Python buffer, giving an effectively "zero-copy" style transfer that is a huge win for massive files.

The Code

import mmap
import os
import shutil

def zero_copy_transfer(source_path, dest_path):
    """Copies a file using memory mapping for speed."""
    if not os.path.exists(source_path):
        print(f"Error: Source file not found at {source_path}")
        return

    # Use shutil.copyfile if the file is small (mmap overhead isn't worth it)
    file_size = os.path.getsize(source_path)
    if file_size < 1024 * 1024 * 5:  # 5 MB threshold
        shutil.copyfile(source_path, dest_path)
        print(f"Standard copy used for small file: {source_path}")
        return

    try:
        with open(source_path, 'rb') as f_in, open(dest_path, 'w+b') as f_out:
            # Resize the output file to the same size as the input file
            f_out.truncate(file_size)

            # Map both files and copy directly between the mappings, without
            # collecting the whole source into one giant Python bytes object
            with mmap.mmap(f_in.fileno(), file_size, access=mmap.ACCESS_READ) as m_in, \
                 mmap.mmap(f_out.fileno(), file_size, access=mmap.ACCESS_WRITE) as m_out:
                m_out.write(m_in)
                m_out.flush()
        
        print(f"Zero-copy transfer complete: {source_path} -> {dest_path}")
        
    except Exception as e:
        print(f"An error occurred: {e}")

if __name__ == "__main__":
    # Replace these paths with your large file test
    SOURCE = 'large_log_file.txt'  
    DEST = 'large_log_file_copy.txt' 
    
    # NOTE: You need a large file at SOURCE path to test this effectively
    # You might want to create a dummy 100MB file first!
    
    # zero_copy_transfer(SOURCE, DEST)
    print("Script ready. Replace file names and uncomment 'zero_copy_transfer' to run.")

Code Explanation & Use Cases

We open the source for reading and the destination with 'w+b' (mmap needs a descriptor it can both read and write). The key step is f_out.truncate(file_size), which pre-allocates space for the destination file. We then map both files into the process's virtual memory and call m_out.write(m_in), so the bytes flow from the source's page cache straight into the destination mapping without ever being gathered into a giant Python buffer. Strictly speaking the kernel still moves pages around (and on Linux, shutil.copyfile on Python 3.8+ already delegates to fast OS-level copies like os.sendfile), but for huge database dumps or video assets this keeps memory usage flat and the transfer fast.
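
To try it out, here's a rough way to create the dummy test file mentioned in the code and compare the two approaches. Run it in the same file as zero_copy_transfer; the file names are placeholders, and the timings will vary a lot with OS caching:

import os
import time
import shutil

# Generate a ~100 MB dummy file to copy (placeholder name)
with open('large_log_file.txt', 'wb') as f:
    f.write(os.urandom(100 * 1024 * 1024))

start = time.perf_counter()
zero_copy_transfer('large_log_file.txt', 'large_log_file_copy.txt')  # function defined above
print(f"mmap copy:   {time.perf_counter() - start:.2f}s")

start = time.perf_counter()
shutil.copyfile('large_log_file.txt', 'large_log_file_copy_std.txt')
print(f"shutil copy: {time.perf_counter() - start:.2f}s")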

3. Incremental Hardlink Backups

Problem Solved

Traditional backups waste disk space by creating a full copy every time. But if you only use simple syncing (like rsync), you lose the version history. We need the space efficiency of syncing and the safety of versioning.

Solution Overview

This script creates daily snapshots using hardlinks. A hardlink is a second entry in the file system for the same file data. When the script runs, it creates a new snapshot folder. It hardlinks the files that haven't changed since the last backup (zero space used!) and physically copies only the files that have changed.

The Code

import os
import time
import datetime
import shutil

SOURCE_DIR = '/path/to/your/important/data'  # <-- CHANGE THIS
BACKUP_ROOT = '/path/to/your/backup/drive'    # <-- CHANGE THIS

def get_last_snapshot(root_dir):
    """Finds the path to the most recent daily snapshot."""
    snapshots = [d for d in os.listdir(root_dir) if os.path.isdir(os.path.join(root_dir, d))]
    if not snapshots:
        return None
    snapshots.sort(reverse=True)
    return os.path.join(root_dir, snapshots[0])

def run_hardlink_backup():
    timestamp = datetime.datetime.now().strftime("%Y-%m-%d_%H%M%S")
    new_snapshot_dir = os.path.join(BACKUP_ROOT, timestamp)
    
    last_snapshot_dir = get_last_snapshot(BACKUP_ROOT)
    
    print(f"Creating new snapshot: {new_snapshot_dir}")
    os.makedirs(new_snapshot_dir, exist_ok=True)
    
    # 1. Hardlink all unchanged files from the last snapshot
    if last_snapshot_dir:
        for root, _, files in os.walk(last_snapshot_dir):
            relative_path = os.path.relpath(root, last_snapshot_dir)
            target_root = os.path.join(new_snapshot_dir, relative_path)
            os.makedirs(target_root, exist_ok=True)
            
            for file in files:
                src_file = os.path.join(root, file)
                dest_file = os.path.join(target_root, file)
                try:
                    # Skip if the source file doesn't exist anymore
                    if os.path.exists(os.path.join(SOURCE_DIR, relative_path, file)):
                        os.link(src_file, dest_file)
                except Exception:
                    pass # Ignore errors for now

    # 2. Copy/Overwrite changed files from the SOURCE_DIR
    for root, dirs, files in os.walk(SOURCE_DIR):
        relative_path = os.path.relpath(root, SOURCE_DIR)
        target_root = os.path.join(new_snapshot_dir, relative_path)
        os.makedirs(target_root, exist_ok=True)
        
        for file in files:
            src_file = os.path.join(root, file)
            dest_file = os.path.join(target_root, file)
            prev_file = os.path.join(last_snapshot_dir, relative_path, file) if last_snapshot_dir else None

            # Check if file has changed (simple check: modification time comparison)
            if prev_file is None or not os.path.exists(prev_file) or \
               os.path.getmtime(src_file) > os.path.getmtime(prev_file):

                # Remove the hardlink created in step 1 first, so we don't write
                # through it into the previous snapshot's copy
                if os.path.exists(dest_file):
                    os.remove(dest_file)

                print(f"Copying (changed/new): {file}")
                shutil.copy2(src_file, dest_file) # copy2 preserves metadata
            # Else: File already hardlinked from step 1

    print("Backup finished.")

if __name__ == "__main__":
    # IMPORTANT: Update SOURCE_DIR and BACKUP_ROOT paths
    # run_hardlink_backup()
    print("Script ready. Please update paths and uncomment 'run_hardlink_backup' to run.")

Code Explanation & Use Cases

This script relies on os.link(), which creates a hardlink. Hardlinks only work within the same disk/partition. The core logic is:

  1. Find the path of the last backup (get_last_snapshot).

  2. Create the new snapshot folder.

  3. Walk the last snapshot and hardlink everything to the new snapshot. (This is fast and uses no extra space.)

  4. Walk the source data. If a file is new or has a newer modification time (os.path.getmtime), remove the hardlink created in step 3 and copy the fresh version into the new snapshot. Unlinking first matters: writing through the hardlink would silently modify the previous snapshot too.

This is a fantastic automation to run via Cron or Windows Task Scheduler for true, space-efficient, versioned local backups.
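
Once it's on a schedule, old snapshots pile up, so you'll eventually want a retention policy as well. A minimal pruning sketch (BACKUP_ROOT matches the placeholder above; the 30-day cutoff is my own assumption, and remember that deleting a snapshot only frees disk space for files whose last remaining hardlink lived there):

import datetime
import os
import shutil

BACKUP_ROOT = '/path/to/your/backup/drive'  # same placeholder as above
KEEP_DAYS = 30                              # assumed retention window

def prune_old_snapshots(root_dir=BACKUP_ROOT, keep_days=KEEP_DAYS):
    """Deletes snapshot folders older than keep_days, based on their timestamped names."""
    cutoff = datetime.datetime.now() - datetime.timedelta(days=keep_days)
    for name in sorted(os.listdir(root_dir)):
        path = os.path.join(root_dir, name)
        if not os.path.isdir(path):
            continue
        try:
            snapshot_time = datetime.datetime.strptime(name, "%Y-%m-%d_%H%M%S")
        except ValueError:
            continue  # not one of our snapshot folders
        if snapshot_time < cutoff:
            print(f"Pruning old snapshot: {name}")
            shutil.rmtree(path)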

4. Auto-Detect Broken Links in Local Markdown Docs

Problem Solved

If you manage a documentation site, a large README, or a local knowledge base built with Markdown, broken internal or external links are a constant headache. Manual checking is impossible.

Solution Overview

This script scans all Markdown files in a directory, extracts all [text](link) patterns, and then checks the validity of each link. It uses standard file checks for internal links and the requests library for external URLs.

The Code

import os
import re
import requests

DOCS_DIR = './docs' # <-- CHANGE THIS to your docs folder

def check_link(link, base_path):
    """Checks if a link is valid (local file or external URL)."""
    if link.startswith('http'):
        # External link check
        try:
            response = requests.head(link, timeout=5, allow_redirects=True)
            if response.status_code >= 400:
                return f"External link broken: Status {response.status_code}"
            return "OK"
        except requests.RequestException as e:
            return f"External link failed: {e.__class__.__name__}"
    elif link.startswith('#'):
        # Anchor links are tricky to validate robustly, skipping for simplicity
        return "Anchor (SKIPPED)" 
    else:
        # Local file check (ignore any #fragment appended to the path)
        link = link.split('#')[0]
        full_path = os.path.join(os.path.dirname(base_path), link)
        if not os.path.exists(full_path):
            # Fall back to a path relative to the DOCS_DIR root
            if not os.path.exists(os.path.join(DOCS_DIR, link.lstrip('/'))):
                return f"Local file missing: {full_path}"

        return "OK"

def find_and_check_links():
    # RegEx to find Markdown links: [text](link)
    LINK_REGEX = re.compile(r'\[.*?\]\((.*?)\)') 
    
    print(f"Scanning for links in {DOCS_DIR}...")
    broken_links = 0

    for root, _, files in os.walk(DOCS_DIR):
        for file_name in files:
            if file_name.endswith('.md'):
                file_path = os.path.join(root, file_name)
                with open(file_path, 'r', encoding='utf-8') as f:
                    content = f.read()
                    
                    for match in LINK_REGEX.finditer(content):
                        parts = match.group(1).split()  # take the link, ignore an optional title
                        if not parts:
                            continue  # skip empty links like []()
                        link = parts[0]
                        result = check_link(link, file_path)
                        
                        if result != "OK" and "SKIPPED" not in result:
                            broken_links += 1
                            print(f"[🚨 BROKEN] File: {file_path}, Link: {link}, Reason: {result}")

    if broken_links == 0:
        print("\n✅ All links seem fine!")
    else:
        print(f"\n❌ Found {broken_links} broken links.")

if __name__ == "__main__":
    # You need the 'requests' library: pip install requests
    # find_and_check_links()
    print("Script ready. Update DOCS_DIR and uncomment 'find_and_check_links' to run.")

Code Explanation & Use Cases

This script requires the requests library. We use os.walk to find all .md files. The RegEx r'\[.*?\]\((.*?)\)' grabs the URL within the parentheses. The check_link function distinguishes between local file paths (using os.path.exists) and external URLs (using requests.head to check the status code without downloading the entire page). Running this before every documentation deployment is a massive quality-of-life improvement.
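
To make this part of a CI pipeline or pre-commit hook, have the function report its count and exit non-zero when something is broken. A quick sketch, assuming you add return broken_links at the end of find_and_check_links:

import sys

if __name__ == "__main__":
    # Assumes find_and_check_links() above was changed to end with `return broken_links`
    broken_count = find_and_check_links()
    sys.exit(1 if broken_count else 0)  # a non-zero exit fails the CI job or pre-commit hook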

5. Email Triage via IMAP IDLE (Real-time Email Notifications)

Problem Solved

Polling your mailbox every few minutes to check for urgent emails is inefficient and slow. For true real-time email triage without constantly checking a browser tab, we need a better method.

Solution Overview

The IMAP IDLE command allows a client (our Python script) to instruct the server (like Gmail) to notify it immediately when a change (like a new email) occurs in a monitored folder. This is a low-latency, low-resource way to check for highly specific, urgent emails (e.g., "Deployment Failed").

The Code

import imaplib
import time
import re

# --- CONFIGURATION ---
IMAP_SERVER = 'imap.gmail.com'
EMAIL_ADDRESS = 'your_email@gmail.com' # <-- CHANGE THIS
PASSWORD = 'your_app_password'         # <-- CHANGE THIS (Use App Password for Gmail)
MONITOR_FOLDER = 'INBOX'
URGENT_REGEX = re.compile(r'(failed|alert|critical|deploy|error)', re.IGNORECASE)
# ---------------------

def imap_idle_triage():
    try:
        # Connect and Login
        mail = imaplib.IMAP4_SSL(IMAP_SERVER)
        mail.login(EMAIL_ADDRESS, PASSWORD)
        mail.select(MONITOR_FOLDER)
        print(f"Logged in and monitoring {MONITOR_FOLDER} using IDLE...")

        while True:
            # Enter IDLE mode
            mail.send(b'A001 IDLE\r\n')
            response = mail.readline()  # the server should answer with a '+ idling' continuation
            if not response.startswith(b'+'):
                print(f"Server refused IDLE: {response}")
                break

            # Block here until the server pushes an untagged update (e.g. '* 42 EXISTS')
            update = mail.readline()

            # Leave IDLE mode and consume lines up to the tagged completion ('A001 OK ...')
            mail.send(b'DONE\r\n')
            while b'A001' not in mail.readline():
                pass

            if b'EXISTS' in update or b'RECENT' in update:
                print("\nNew mail activity detected!")

                # Check for new, urgent emails
                status, messages = mail.search(None, 'UNSEEN')

                if status == 'OK':
                    for msg_id in messages[0].split():
                        # Fetch only the Subject header (PEEK avoids marking the mail as read)
                        status, data = mail.fetch(msg_id, '(BODY.PEEK[HEADER.FIELDS (SUBJECT)])')

                        if status == 'OK' and data and isinstance(data[0], tuple):
                            header = data[0][1].decode('utf-8', errors='replace')
                            subject_match = re.search(r'Subject:\s*(.*)', header, re.IGNORECASE)
                            subject = subject_match.group(1).strip() if subject_match else "[Subject Not Found]"

                            if URGENT_REGEX.search(subject):
                                print(f"🚨 URGENT EMAIL: {subject}")
                                # Add code here to trigger a system notification or a text message
                            else:
                                print(f"New email: {subject}")

                mail.select(MONITOR_FOLDER) # Reselect the folder before idling again

            time.sleep(1) # Small pause before re-entering IDLE

    except Exception as e:
        print(f"IMAP IDLE error: {e}")
    finally:
        try:
            mail.logout()
        except:
            pass

if __name__ == "__main__":
    # IMPORTANT: Update configurations and use an App Password for secure login
    # imap_idle_triage()
    print("Script ready. Update config and uncomment 'imap_idle_triage' to run.")

Code Explanation & Use Cases

This is a more advanced script. You must use an App Password (not your regular account password) for services like Gmail. We connect, select the folder, and send the raw IDLE command; the server first answers with a "+ idling" continuation, and the next mail.readline() then blocks until it pushes an untagged update such as "* 42 EXISTS". At that point we send DONE to leave IDLE, search for UNSEEN messages, and check each subject (fetched with BODY.PEEK so nothing gets marked as read) against our URGENT_REGEX. In a long-running deployment you would also re-issue IDLE roughly every 29 minutes, since most servers drop idle connections after about half an hour. This is priceless for DevOps engineers or anyone needing immediate alerts on server failures or critical system health checks.
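
For the "add code here" hook, a rough desktop-notification helper might look like this. It's a best-effort sketch: it assumes notify-send is available on Linux and uses osascript on macOS; everywhere else it just prints.

import platform
import subprocess

def notify(title, message):
    """Best-effort desktop notification; falls back to printing."""
    system = platform.system()
    if system == 'Linux':
        subprocess.run(['notify-send', title, message])
    elif system == 'Darwin':  # macOS
        subprocess.run(['osascript', '-e',
                        f'display notification "{message}" with title "{title}"'])
    else:
        print(f"[{title}] {message}")

# Inside imap_idle_triage, where the urgent email is detected:
# notify("URGENT EMAIL", subject)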

6. OCR + Full-Text Search for Screenshots

Problem Solved

Screenshots contain valuable text (error messages, code snippets, meeting notes) that is inaccessible to your system's search bar. Finding that one old screenshot is a nightmare.

Solution Overview

This script processes a directory of screenshots, uses an Optical Character Recognition (OCR) tool (like Tesseract, via the pytesseract library) to extract all text, and stores the text in a simple SQLite database alongside the file path. You can then query the database for any text found in any of your images.

The Code

import os
import sqlite3
from PIL import Image
import pytesseract # pip install pytesseract, and install Tesseract-OCR engine separately!

SCREENSHOTS_DIR = './screenshots' # <-- CHANGE THIS
DB_PATH = 'ocr_index.db'
IMAGE_EXTENSIONS = ('.png', '.jpg', '.jpeg')

def create_db(conn):
    cursor = conn.cursor()
    cursor.execute("""
        CREATE TABLE IF NOT EXISTS ocr_data (
            filepath TEXT PRIMARY KEY,
            ocr_text TEXT,
            timestamp DATETIME DEFAULT CURRENT_TIMESTAMP
        )
    """)
    conn.commit()

def process_screenshots():
    conn = sqlite3.connect(DB_PATH)
    create_db(conn)
    cursor = conn.cursor()

    for root, _, files in os.walk(SCREENSHOTS_DIR):
        for file_name in files:
            if file_name.lower().endswith(IMAGE_EXTENSIONS):
                file_path = os.path.join(root, file_name)
                
                # Check if already indexed
                cursor.execute("SELECT filepath FROM ocr_data WHERE filepath = ?", (file_path,))
                if cursor.fetchone():
                    continue

                try:
                    # OCR process
                    text = pytesseract.image_to_string(Image.open(file_path))
                    
                    # Store data
                    cursor.execute("INSERT INTO ocr_data (filepath, ocr_text) VALUES (?, ?)", 
                                   (file_path, text))
                    conn.commit()
                    print(f"Indexed: {file_name}")
                except Exception as e:
                    print(f"Could not process {file_name}: {e}")

    conn.close()

def search_screenshots(query):
    conn = sqlite3.connect(DB_PATH)
    cursor = conn.cursor()
    
    # Use LIKE for simple full-text search
    cursor.execute("SELECT filepath, ocr_text FROM ocr_data WHERE ocr_text LIKE ?", (f'%{query}%',))
    results = cursor.fetchall()
    conn.close()
    
    if results:
        print(f"\n--- Found {len(results)} matches for '{query}' ---")
        for filepath, text in results:
            print(f"Match in: {filepath}")
            # print(f"Snippet: {text[:200]}...") # Optional: print snippet
    else:
        print(f"\nNo matches found for '{query}'.")

if __name__ == "__main__":
    # You need pytesseract and the Tesseract engine installed
    # process_screenshots()
    # search_screenshots("your search term here")
    print("Script ready. Install pytesseract/Tesseract, update config, then run process/search functions.")

Code Explanation & Use Cases

This script needs pytesseract and the underlying Tesseract OCR engine (must be installed separately on your OS). We use sqlite3 to create a lightweight, local database. The process_screenshots function iterates over the image files, runs pytesseract.image_to_string(), and inserts the result into the DB. The search_screenshots function then allows you to query the indexed text. This is a life-saver for finding old error traces or configuration details hidden in images.
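
If plain LIKE queries get slow once you've indexed thousands of screenshots, SQLite's built-in FTS5 extension gives you ranked full-text search with almost no extra code. A minimal sketch (the ocr_fts table name is my own, and it assumes your Python's SQLite build includes FTS5, which most modern distributions do):

import sqlite3

conn = sqlite3.connect('ocr_index.db')
conn.execute("CREATE VIRTUAL TABLE IF NOT EXISTS ocr_fts USING fts5(filepath, ocr_text)")

# Index exactly as before, just into the virtual table:
# conn.execute("INSERT INTO ocr_fts (filepath, ocr_text) VALUES (?, ?)", (file_path, text))

# Search with MATCH and relevance ordering instead of LIKE:
rows = conn.execute(
    "SELECT filepath FROM ocr_fts WHERE ocr_fts MATCH ? ORDER BY rank",
    ('timeout error',)
).fetchall()
print(rows)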

7. Quick AST Import Tracker

Problem Solved

When reviewing legacy code or cleaning up a massive Python file, you often wonder which imported modules are actually used by the logic. Commenting out imports until the code breaks is frustrating.

Solution Overview

The ast (Abstract Syntax Tree) module allows Python to inspect its own structure. This script reads a Python file, parses its AST, collects every imported name (including from X import Y), and flags any imports that are never referenced in the rest of the code body.

The Code

import ast
import sys
from collections import defaultdict

def analyze_python_file(filepath):
    """Analyzes a Python file to track imported modules and their usage."""
    with open(filepath, 'r') as f:
        tree = ast.parse(f.read())

    imported_names = {} # {alias: module_name or name}
    used_names = set()

    # 1. Collect all imported names
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                imported_names[alias.asname or alias.name] = alias.name
        elif isinstance(node, ast.ImportFrom):
            if node.module:
                for alias in node.names:
                    imported_names[alias.asname or alias.name] = f"{node.module}.{alias.name}"
            
    # 2. Find all names used in the code
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and isinstance(node.ctx, (ast.Load, ast.Store, ast.Del)):
            used_names.add(node.id)

    # 3. Compare and Report
    unused_imports = {}
    for alias, module_name in imported_names.items():
        # Check if the alias (or the name itself) was ever used
        if alias not in used_names:
            unused_imports[alias] = module_name
            
    print(f"--- Unused Imports in {filepath} ---")
    if unused_imports:
        for alias, module_name in unused_imports.items():
            print(f"Potential unused import: {module_name} (as {alias})")
    else:
        print("✅ Clean: No immediately obvious unused imports found.")
        
    return unused_imports

if __name__ == "__main__":
    # Example usage: Pass a Python file path
    # If you run this script on itself, it will find its own unused imports (if any)
    if len(sys.argv) < 2:
        print(f"Usage: python {sys.argv[0]} <path_to_python_file>")
    else:
        filepath = sys.argv[1]
        # analyze_python_file(filepath)
        print("Script ready. Pass a Python file path as an argument and uncomment 'analyze_python_file' to run.")

Code Explanation & Use Cases

This is a sophisticated static analysis tool. The ast module creates a tree representation of the code. We first walk the tree and populate imported_names with every alias and module. We then walk the tree again to find every ast.Name node that is being Loaded (i.e., used). By comparing the two sets (imported_names vs. used_names), we can identify modules that were imported but whose names never appeared in the code body. This is a powerful, dependency-free way to clean up large files and improve cold-start times.
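
To make the output concrete, here's a tiny hypothetical input file. Running the tracker on it (with the analyze_python_file call uncommented) prints "Potential unused import: json (as json)", while os and Path pass because their names show up as ast.Name loads:

# sample.py -- hypothetical file to analyze
import os
import json          # imported but never referenced below
from pathlib import Path

def newest(directory):
    return max(Path(directory).iterdir(), key=os.path.getmtime)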

8. Git Commit Time Visualizer (Activity Heatmap)

Problem Solved

Sometimes you need to analyze when you (or your team) are most productive, or figure out why commits stop abruptly. The standard git log is too verbose for this.

Solution Overview

This script uses the subprocess module to run git log, extracts the commit timestamps, and generates a simple, text-based heatmap showing commit activity broken down by hour of the day and day of the week.

The Code

import subprocess
import datetime
from collections import defaultdict

def create_commit_heatmap(repo_path='.'):
    """Generates a text-based heatmap of commit times."""
    
    # Run git log and format the output as raw timestamps
    try:
        command = [
            'git', 'log', '--all', 
            '--pretty=format:%at' # %at is author timestamp (seconds since epoch)
        ]
        result = subprocess.run(
            command, 
            cwd=repo_path, 
            capture_output=True, 
            text=True, 
            check=True
        )
        timestamps = result.stdout.strip().split('\n')
    except subprocess.CalledProcessError as e:
        print(f"Error running git: {e.stderr.strip()}")
        return
    except FileNotFoundError:
        print("Error: Git executable not found.")
        return

    # Initialize heatmap: [Day of Week][Hour of Day]
    heatmap = defaultdict(lambda: defaultdict(int))
    
    for ts in timestamps:
        try:
            timestamp = int(ts)
            dt_object = datetime.datetime.fromtimestamp(timestamp)
            day_of_week = dt_object.weekday() # Monday is 0, Sunday is 6
            hour_of_day = dt_object.hour
            heatmap[day_of_week][hour_of_day] += 1
        except ValueError:
            continue

    # Visualization
    DAYS = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']
    HOURS = [f"{h:02d}" for h in range(24)]
    
    print("\n--- Git Commit Activity Heatmap (Commits Per Hour/Day) ---")
    print("      " + " ".join(HOURS))
    print("      " + "-" * (3 * 24))
    
    for d_index, day in enumerate(DAYS):
        row = f"{day} | "
        for h_index in range(24):
            count = heatmap[d_index][h_index]
            # Use simple characters for visualization
            if count == 0:
                char = ' . '
            elif count < 5:
                char = ' ░ '
            elif count < 15:
                char = ' ▒ '
            else:
                char = ' ▓ '
            row += char
        print(row)

if __name__ == "__main__":
    # create_commit_heatmap('.') # Runs on the current directory if it's a Git repo
    print("Script ready. Uncomment 'create_commit_heatmap' in a Git repository to run.")

Code Explanation & Use Cases

We use subprocess to execute git log --pretty=format:%at, which is the fastest way to get raw timestamps. The output is parsed into a datetime object. We then use a defaultdict to count commits by day (weekday() 0-6) and hour (hour 0-23). The final print loop iterates through the structure, using simple Unicode block characters (░, ▒, ▓) to create a visual heatmap. I love using this to quickly diagnose burnout or spot patterns in team collaboration.
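
If you want the heatmap for a single author or a recent window rather than the whole history, git log's standard filters slot straight into the command list inside create_commit_heatmap (the email address below is a placeholder):

# Narrow the heatmap to one author and the last 90 days
command = [
    'git', 'log', '--all',
    '--author=you@example.com',   # placeholder address
    '--since=90 days ago',
    '--pretty=format:%at'
]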

9. Daily Log Summarizer (Jupyter Notebook/Text Files)

Problem Solved

Daily logs, meeting notes, or research journals grow quickly into thousands of lines. Finding the key takeaways or action items from last week's notes is a huge effort.

Solution Overview

This script uses the gensim library (specifically its TextRank-based summarize function) for unsupervised text summarization. It takes a collection of text files (or a single large log file) and generates a concise summary at a configurable fraction of the original length.

The Code

import os
from gensim.summarization import summarize  # needs gensim 3.x: pip install "gensim<4.0"

LOG_DIR = './daily_logs' # <-- CHANGE THIS
SUMMARY_LENGTH = 0.3 # 30% of original text length

def summarize_logs():
    all_text = []
    
    # Load all text files in the log directory
    for root, _, files in os.walk(LOG_DIR):
        for file_name in files:
            if file_name.endswith(('.txt', '.log')):
                file_path = os.path.join(root, file_name)
                try:
                    with open(file_path, 'r', encoding='utf-8') as f:
                        all_text.append(f.read())
                except Exception:
                    continue
    
    # Combine text for holistic summarization
    combined_text = "\n\n".join(all_text)
    
    if not combined_text.strip():
        print("No content found in logs to summarize.")
        return

    try:
        # Use TextRank algorithm for summarization
        summary = summarize(combined_text, ratio=SUMMARY_LENGTH) 
        
        print("\n--- Daily Log Summary ---")
        print(summary)
        print("-" * 30)
        
        # Optional: Save the summary
        with open('daily_summary.txt', 'w', encoding='utf-8') as f:
            f.write(summary)
        print("Summary saved to daily_summary.txt")

    except Exception as e:
        print(f"Summarization failed (check gensim/TextRank requirements): {e}")

if __name__ == "__main__":
    # You need gensim 3.x (the summarization module was removed in gensim 4.0): pip install "gensim<4.0"
    # summarize_logs()
    print("Script ready. Install gensim, update LOG_DIR, and uncomment 'summarize_logs' to run.")

Code Explanation & Use Cases

This relies on gensim's summarization module, which ships with gensim 3.x (it was removed in gensim 4.0, so install with pip install "gensim<4.0"). The summarize function implements the TextRank algorithm, which identifies the most important sentences based on graph theory (sentences that share many common words are ranked higher). By setting ratio=0.3, we ask the script to return a summary that is roughly 30% of the original text's length. I use this every Friday to quickly review my week's work notes and compile my status report.
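
gensim 3.x also ships a TextRank-based keyword extractor in the same module, which is handy for tagging the weekly report. A small sketch, assuming you call it where combined_text is in scope (e.g. inside summarize_logs):

from gensim.summarization import keywords  # gensim 3.x only, like summarize above

top_terms = keywords(combined_text, words=15)  # top 15 TextRank keywords, one per line
print("Top terms this week:\n" + top_terms)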

10. PDF Splitter/Merger with Auto-Index

Problem Solved

Splitting a 500-page bank statement or a large technical manual into individual chapters is a manual drag. Merging several small reports into one indexed PDF is also a chore.

Solution Overview

The PyPDF2 library (or the newer pypdf) is a workhorse for PDF manipulation. This script provides functions to split a PDF based on page ranges and merge multiple PDFs into one, automatically adding a Table of Contents (TOC) using bookmarks if required.

The Code

from PyPDF2 import PdfReader, PdfWriter # pip install PyPDF2
import os

def pdf_split(input_path, output_dir, page_ranges):
    """Splits a PDF based on a dictionary of {title: [start_page, end_page]}."""
    reader = PdfReader(input_path)
    os.makedirs(output_dir, exist_ok=True)

    for title, (start, end) in page_ranges.items():
        writer = PdfWriter()
        # Pages are 0-indexed, so we use start-1 to end
        for page_num in range(start - 1, end): 
            writer.add_page(reader.pages[page_num])

        output_path = os.path.join(output_dir, f"{title}.pdf")
        with open(output_path, "wb") as output_file:
            writer.write(output_file)
        print(f"Created: {output_path}")

def pdf_merge(file_list, output_path):
    """Merges a list of PDFs and adds bookmarks for indexing."""
    merger = PdfWriter()

    for path in file_list:
        title = os.path.basename(path).replace(".pdf", "")
        # Get the page number where this file starts
        page_start = len(merger.pages) 
        merger.append(path)
        # Add a bookmark pointing to the start of the merged file
        merger.add_outline_item(title, page_start) 

    with open(output_path, "wb") as output_file:
        merger.write(output_file)
    print(f"Merged and indexed PDF created: {output_path}")

if __name__ == "__main__":
    # You need PyPDF2: pip install PyPDF2
    
    # --- SPLIT EXAMPLE ---
    # INPUT_PDF = 'large_manual.pdf' # <-- CHANGE THIS
    # RANGES = {
    #     "Chapter 1 - Intro": [1, 10], 
    #     "Chapter 2 - Setup": [11, 25]
    # }
    # pdf_split(INPUT_PDF, './split_output', RANGES)
    
    # --- MERGE EXAMPLE ---
    # MERGE_LIST = ['./report_a.pdf', './report_b.pdf', './report_c.pdf'] # <-- CHANGE THIS
    # pdf_merge(MERGE_LIST, 'final_combined_report.pdf')
    print("Script ready. Install PyPDF2, update file paths, and uncomment the desired function to run.")

Code Explanation & Use Cases

The PdfReader and PdfWriter objects from PyPDF2 do the heavy lifting. In pdf_split, we loop through the page indices and add them to a new PdfWriter. In pdf_merge, we use merger.append(path) to add content, and the key trick is using merger.add_outline_item(title, page_start) which creates a bookmark (TOC entry) in the resulting PDF at the page number where the new file began. This is invaluable for finance, legal, or documentation teams.
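
When you're not sure what page ranges to put in the RANGES dict, a couple of lines with PdfReader will tell you the page count and list any existing top-level bookmarks. A sketch with a placeholder file name (nested bookmarks come back as lists and are simply skipped here):

from PyPDF2 import PdfReader

reader = PdfReader('large_manual.pdf')  # placeholder path
print(f"Total pages: {len(reader.pages)}")
for item in reader.outline:             # existing top-level bookmarks, if any
    if hasattr(item, 'title'):
        page = reader.get_destination_page_number(item) + 1
        print(f"Bookmark: {item.title} -> page {page}")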

11. System Resource Dashboard (Cross-Platform)

Problem Solved

Checking CPU load, memory usage, disk activity, or network bandwidth often requires opening three different system monitors, which distracts you from your main work.

Solution Overview

The psutil library provides an excellent cross-platform way to access system metrics. This script runs a simple, real-time dashboard in the terminal, giving you key metrics at a glance.

The Code

import psutil # pip install psutil
import time
import os
import sys

def clear_screen():
    """Clears the terminal screen."""
    os.system('cls' if os.name == 'nt' else 'clear')

def get_sys_metrics():
    """Fetches core system metrics using psutil."""
    cpu_percent = psutil.cpu_percent(interval=None) # Non-blocking call
    mem_info = psutil.virtual_memory()
    disk_usage = psutil.disk_usage('/') # Use '/' for Linux/macOS, or 'C:' for Windows
    net_io = psutil.net_io_counters()

    return {
        'CPU': f"{cpu_percent:0.1f}%",
        'Memory': f"{mem_info.percent:0.1f}% ({mem_info.used / (1024**3):.2f} GB used)",
        'Disk': f"{disk_usage.percent}% ({disk_usage.free / (1024**3):.2f} GB free)",
        'Net_Sent': f"{net_io.bytes_sent / (1024**2):.2f} MB",
        'Net_Recv': f"{net_io.bytes_recv / (1024**2):.2f} MB",
    }

def run_dashboard(interval=1.0):
    clear_screen()
    print("System Resource Dashboard (Press Ctrl+C to stop)")
    print("-" * 40)
    
    # Get initial metrics to calculate CPU usage over an interval
    psutil.cpu_percent(interval=None) 

    try:
        first_draw = True
        while True:
            metrics = get_sys_metrics()

            if not first_draw:
                # Move the cursor back up over the previous 6 lines and clear them
                # (ANSI escape codes; works in most modern terminals)
                sys.stdout.write('\033[F\033[K' * 6)
            first_draw = False

            print(f"Time: {time.strftime('%H:%M:%S')}")
            print(f"CPU Usage: {metrics['CPU']}")
            print(f"Memory:    {metrics['Memory']}")
            print(f"Disk Root: {metrics['Disk']}")
            print(f"Net Sent:  {metrics['Net_Sent']}")
            print(f"Net Recv:  {metrics['Net_Recv']}")

            time.sleep(interval)
            
    except KeyboardInterrupt:
        print("\nDashboard stopped.")

if __name__ == "__main__":
    # You need psutil: pip install psutil
    # run_dashboard(interval=2)
    print("Script ready. Install psutil and uncomment 'run_dashboard' to run the real-time terminal monitor.")

Code Explanation & Use Cases

psutil abstracts away the OS-specific commands. The key is calling psutil.cpu_percent(interval=None) once before the loop to prime it; each subsequent call inside the loop then reports the CPU usage since the previous call, giving an accurate percentage for that interval. The ANSI escapes '\033[F' (cursor up one line) and '\033[K' (clear line) let us redraw the six metric lines in place, creating a true real-time dashboard effect in most terminals. I keep this running on my secondary monitor during build processes or stress tests.
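
psutil can also answer the follow-up question the dashboard usually raises: what exactly is eating the memory? A short sketch of a top-5 listing by resident memory:

import psutil

def top_memory_processes(n=5):
    """Returns the n processes with the largest resident memory usage."""
    procs = []
    for p in psutil.process_iter(['pid', 'name', 'memory_info']):
        mem = p.info.get('memory_info')
        if mem is None:
            continue  # access denied or process info unavailable
        procs.append((mem.rss, p.info['pid'], p.info['name']))
    procs.sort(reverse=True)
    return procs[:n]

for rss, pid, name in top_memory_processes():
    print(f"{name or '?':<25} pid={pid:<8} {rss / (1024**2):8.1f} MB")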

12. Local Web API to Trigger Small Automations

Problem Solved

You want to run a simple Python script (like "Clean Temp Files" or "Trigger Backup") from any device on your local network (e.g., your phone or a separate machine), or trigger it easily via a desktop shortcut.

Solution Overview

We can use a super lightweight web framework like Flask to expose our simple automation functions as a local API endpoint. When you visit http://localhost:5000/trigger_cleanup, the Python function runs.

The Code

from flask import Flask, jsonify # pip install flask
import os
import time

app = Flask(__name__)

# --- Simple Automation Function ---
def cleanup_temp_files():
    """A placeholder for your actual cleanup automation."""
    
    # Example: Delete files in a temp folder older than 1 day
    TEMP_DIR = './temp_storage' 
    if not os.path.exists(TEMP_DIR):
        os.makedirs(TEMP_DIR)
        return 0
        
    deleted_count = 0
    now = time.time()
    for filename in os.listdir(TEMP_DIR):
        filepath = os.path.join(TEMP_DIR, filename)
        if os.path.isfile(filepath) and now - os.stat(filepath).st_mtime > (24 * 3600):
            os.remove(filepath)
            deleted_count += 1
            
    return deleted_count

# --- API Endpoint ---
@app.route('/trigger_cleanup', methods=['POST', 'GET'])
def trigger_cleanup():
    deleted = cleanup_temp_files()
    if deleted > 0:
        message = f"Cleanup successful. Deleted {deleted} old temporary files."
    else:
        message = "Cleanup ran. No old files found or temp directory was empty."
        
    # Return a JSON response
    return jsonify({"status": "success", "message": message, "files_deleted": deleted})

@app.route('/', methods=['GET'])
def index():
    return "Automation Hub Running. Visit /trigger_cleanup to run the task."

if __name__ == "__main__":
    # Run the server on the local network IP (0.0.0.0) so other devices can access it.
    # Set host='127.0.0.1' if you only want local access.
    # You need Flask: pip install Flask
    # app.run(host='0.0.0.0', port=5000, debug=False) 
    print("Script ready. Install Flask, uncomment 'app.run', and access http://127.0.0.1:5000 in your browser.")

Code Explanation & Use Cases

This script requires Flask. We define a function cleanup_temp_files (where you'd put any short automation logic). We then use the @app.route decorator to link this function to a web URL /trigger_cleanup. When someone hits this URL, the function executes, and the server returns a clear JSON response. This is my favorite "lazy button" for tasks. I have this running on a headless Raspberry Pi, and I simply hit the URL from my phone to run a full backup.
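
One caution: with host='0.0.0.0', anyone on your network can hit these endpoints, so it's worth at least a shared-secret check before any automation runs. A minimal sketch to drop into the same file below app = Flask(__name__); the token value is obviously a placeholder:

from flask import request, abort

API_TOKEN = "change-me-to-something-long-and-random"  # placeholder secret

@app.before_request
def require_token():
    # Allow the index page, require ?token=... for everything else
    if request.path != '/' and request.args.get('token') != API_TOKEN:
        abort(403)

# Then trigger the task with: http://<your-server-ip>:5000/trigger_cleanup?token=<your token>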

Final Thoughts: The Power of Intentional Automation

These scripts are more than just Python code; they are little pockets of focused time-saving. By moving beyond the basics and tackling advanced, real-world annoyances like IMAP IDLE and AST parsing, you truly harness the power Python offers to us as developers.

The best automation isn't one you run once—it's the one you forget about because it's running silently in the background, making your workday smoother.

Now, take one of these, customize the configuration paths, and set it up on a scheduler (Cron, Task Scheduler, or even our Flask API). I guarantee you'll feel that satisfying click as you eliminate a repetitive task forever. Happy scripting!
