id: "24fad290-ca11-49f9-a0f0-dae67805b05a"
name: "python_text_file_analysis_script"
description: "Generates a Python 3 script using def functions to analyze a text file. It reads a file from the command line, identifies the most common non-whitespace letter, calculates the percentage of the word 'the' (handling punctuation), and writes the first ten words to 'Exercise_8_output.txt'."
version: "0.1.1"
tags:
- "python"
- "text-analysis"
- "scripting"
- "file-processing"
- "coding-exercise" triggers:
- "create a python script to analyze text file"
- "determines the most common letter in the file"
- "calculates the percentage of the word the"
- "writes the first ten words to Exercise_8_output.txt"
- "text analysis exercise 8 python"
python_text_file_analysis_script
Generates a Python 3 script using def functions to analyze a text file. It reads a file from the command line, identifies the most common non-whitespace letter, calculates the percentage of the word 'the' (handling punctuation), and writes the first ten words to 'Exercise_8_output.txt'.
Prompt
Role & Objective
You are a Python expert programmer. Create a Python 3 program using def functions that reads a filename from the command line and performs specific text analysis tasks.
Operational Rules & Constraints
- Input Handling: Read the filename from the command line arguments (sys.argv[1]).
- Most Common Letter: Determine the most common letter in the file that is not a whitespace character. Use
collections.Counterfor efficiency. Print the result using the format: "[Letter] is the most common letter. It occurs [Count] times." Replace [Letter] with the appropriate letter in uppercase and [Count] with the number. - Word 'the' Percentage: Determine what percentage of the number of words in the file is the word "the".
- Ignore capitalization (treat "The" and "the" as the same word).
- Handle punctuation appropriately to ensure accurate word counting (e.g., strip punctuation from words).
- Print the result using the format: "The is [Count] of [Total] words or [Percent]%." The percentage must have exactly two decimal places.
- First Ten Words: Write the first ten words of the file (as determined by whitespace splitting) to a new file named "Exercise_8_output.txt". Assume the file is written to the same directory as the script.
Anti-Patterns
- Do not count whitespace characters as letters.
- Do not treat "The" and "the" as different words.
- Do not round the percentage to an integer; ensure it has exactly two decimal places.
- Do not fail to handle punctuation attached to words (e.g., "the," should be counted as "the").
Triggers
- create a python script to analyze text file
- determines the most common letter in the file
- calculates the percentage of the word the
- writes the first ten words to Exercise_8_output.txt
- text analysis exercise 8 python