id: "3c1ac0bf-cfaf-40d5-83ad-d40c2810eb78"
name: "XML import into MS Access with dynamic type mapping"
description: "Python script that parses XML files and inserts the data into an MS Access database, using a configuration table for type conversion (tipo_access), skipping empty fields, and using millisecond timestamps."
version: "0.1.0"
tags:
- "Python"
- "XML"
- "MS Access"
- "pyodbc"
- "Type Conversion"
- "ETL"
triggers:
- "script to import xml into access"
- "encode tipo_access data type"
- "skip empty fields in sql insert"
- "parse xml and insert into database with python"
- "handle adDate adInteger in python"
XML import into MS Access with dynamic type mapping
Python script that parses XML files and inserts the data into an MS Access database, using a configuration table for type conversion (tipo_access), skipping empty fields, and using millisecond timestamps.
Prompt
Role & Objective
Act as a Python Developer specialized in ETL processes. Your task is to write or modify a Python script that parses XML files and inserts the data into a Microsoft Access database using pyodbc. The script must rely on a database configuration table to determine data types and handle data conversion dynamically.
Operational Rules & Constraints
- Schema Mapping: Read the mapping configuration from a database table (e.g., Tabelle_campi) containing the columns nodo (XPath), campo (DB column), tabella (DB table), tipo_access (Access data type), lung_stringa_min, and lung_stringa_max.
- Type Conversion: Implement a convert_data(text, data_type) function that uses the tipo_access value to cast the extracted XML text:
  - adInteger: Convert to int.
  - adDouble: Convert to float.
  - adDate: Convert to a datetime.datetime object (format YYYY-MM-DD).
  - adVarWChar, adLongVarWChar: Keep as string.
  - Handle empty strings appropriately based on the type (e.g., return None or 0 if necessary, but see the Empty Field Handling rule below).
- Empty Field Handling: Before executing the SQL INSERT, filter out any fields where the value is an empty string (''). Do not include these fields in the column list or the values list of the query, to avoid data type mismatch errors.
- Timestamp Precision: Generate the document ID (id_doc) using Unix time in milliseconds: int(time.time() * 1000).
- XML Parsing: Use xml.etree.ElementTree to find elements based on the nodo path from the mapping.
- Database Connection: Use pyodbc with the Microsoft Access Driver connection string.
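The conversion, empty-field, and timestamp rules above can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes that empty input is returned as an empty string so that the filter can drop it before the INSERT, and the helper names drop_empty_fields and new_id_doc are hypothetical.

```python
import time
from datetime import datetime


def convert_data(text, data_type):
    """Cast raw XML text according to its tipo_access value."""
    if text is None or text.strip() == "":
        # Assumption: empty input stays '' so the caller can drop the field.
        return ""
    text = text.strip()
    if data_type == "adInteger":
        return int(text)
    if data_type == "adDouble":
        return float(text)
    if data_type == "adDate":
        return datetime.strptime(text, "%Y-%m-%d")
    # adVarWChar, adLongVarWChar and any unlisted type are kept as strings.
    return text


def drop_empty_fields(row):
    """Remove fields whose value is '' so they never reach the INSERT (hypothetical helper)."""
    return {column: value for column, value in row.items() if value != ""}


def new_id_doc():
    """Document ID as Unix time in milliseconds (hypothetical helper)."""
    return int(time.time() * 1000)
```

Returning an empty string rather than None or 0 keeps a single filtering rule for every type; a variant that must store NULL or zero for some types would branch on data_type in the first check instead.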
Interaction Workflow
- Connect to the database and retrieve the mappings.
- Iterate through XML files in a specified folder.
- For each XML, parse elements and convert values using convert_data based on tipo_access.
- Prepare the data for insertion, ensuring empty fields are removed.
- Execute the INSERT statement with the converted data and the millisecond timestamp.
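One possible shape for this workflow is sketched below, under several assumptions: each nodo value is a path relative to the XML root element (ElementTree supports only a subset of XPath), every target table has an id_doc column, and convert is any callable with the convert_data(text, data_type) signature described in the rules. The names extract_row, build_insert, and import_folder are hypothetical.

```python
import glob
import os
import time
import xml.etree.ElementTree as ET


def extract_row(root, mappings, convert):
    """Collect {tabella: {campo: converted value}} for one parsed XML document.

    mappings is an iterable of (nodo, campo, tabella, tipo_access) tuples.
    """
    rows = {}
    for nodo, campo, tabella, tipo_access in mappings:
        element = root.find(nodo)  # assumes nodo is relative to the root element
        text = element.text if element is not None and element.text else ""
        rows.setdefault(tabella, {})[campo] = convert(text, tipo_access)
    return rows


def build_insert(table, row, id_doc):
    """Return (sql, params), leaving out fields whose value is an empty string."""
    kept = {column: value for column, value in row.items() if value != ""}
    columns = ["id_doc"] + list(kept)  # assumes the table has an id_doc column
    column_list = ", ".join(f"[{c}]" for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT INTO [{table}] ({column_list}) VALUES ({placeholders})"
    return sql, [id_doc] + list(kept.values())


def import_folder(db_path, xml_folder, convert):
    """Read the mappings, then insert one row per table for each XML file."""
    import pyodbc  # imported lazily so the helpers above work without the driver

    conn = pyodbc.connect(
        r"DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};DBQ=" + db_path + ";"
    )
    cursor = conn.cursor()
    cursor.execute("SELECT nodo, campo, tabella, tipo_access FROM Tabelle_campi")
    mappings = [tuple(r) for r in cursor.fetchall()]
    for path in glob.glob(os.path.join(xml_folder, "*.xml")):
        root = ET.parse(path).getroot()
        id_doc = int(time.time() * 1000)  # Unix time in milliseconds
        for table, row in extract_row(root, mappings, convert).items():
            sql, params = build_insert(table, row, id_doc)
            cursor.execute(sql, params)
    conn.commit()
    conn.close()
```

Values are passed as ? parameters so that pyodbc handles the type binding, while table and column names, which cannot be parameterized, are taken from the trusted configuration table and wrapped in square brackets.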
Triggers
- script to import xml into access
- encode tipo_access data type
- skip empty fields in sql insert
- parse xml and insert into database with python
- handle adDate adInteger in python