name: filedatasource
description: Guidance for using FileDataSource and creating custom file reader extensions in BeepDM. Use when implementing or extending DataManagementEngineStandard/FileManager with new formats via IFileFormatReader and FileReaderFactory.
FileDataSource Guide
Use this skill when working on DataManagementEngineStandard/FileManager to either:
- use FileDataSource correctly as an IDataSource, or
- add a new file-format reader extension.
Use this skill when
- Implementing or fixing FileDataSource behavior (connection, schema, query, CRUD).
- Adding support for a new file type via IFileFormatReader.
- Registering custom readers in FileReaderFactory.
- Debugging format-specific parse behavior, diagnostics, or schema inference.
Do not use this skill when
- The task is only generic IDataSource behavior. Use idatasource.
- The task is only connection config modeling. Use connectionproperties and connection.
- The task is primarily ingestion orchestration policies. Use importing.
Responsibilities
- Keep FileDataSource format-agnostic and delegate parsing/writing to _reader.
- Implement reader-specific logic only in IFileFormatReader implementations.
- Keep schema inference and row parsing consistent with existing file readers.
- Preserve IDataSource contract behavior (ErrorObject, ConnectionStatus, Entities, EntitiesNames).
Core API Surface
- FileDataSource:
  - Openconnection(), Closeconnection()
  - GetEntityStructure(...), GetEntity(...), CRUD methods
  - ResolveEntityFilePath(...)
- Reader contract:
  - IFileFormatReader (Configure, ReadHeaders, ReadRows, GetEntityStructure, write methods)
- Reader registry:
  - FileReaderFactory.Register(...)
  - FileReaderFactory.RegisterDefaults()
  - FileReaderFactory.GetReader(...)
Typical Usage Pattern
- Configure datasource connection (FilePath, FileName, delimiter/props).
- Ensure the reader type is registered for the target DataSourceType.
- Call Openconnection() so FileDataSource resolves _reader via FileReaderFactory.
- Use GetEntityStructure/GetEntity/CRUD through FileDataSource.
- For new formats, implement IFileFormatReader, register it, then use FileDataSource unchanged.
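The register-then-open ordering in the steps above can be illustrated with a small stand-in registry. This is a sketch only: ILineReader, PipeReader, and ReaderRegistrySketch are hypothetical names invented for illustration, not the BeepDM IFileFormatReader or FileReaderFactory types, and the behavior on a missing registration (throwing here) is an assumption. Check FileReaderFactory.cs for the real signatures.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a reader contract; not the real IFileFormatReader.
public interface ILineReader
{
    string FormatKey { get; }
    string[] SplitLine(string line);
}

// Hypothetical reader for pipe-delimited text.
public sealed class PipeReader : ILineReader
{
    public string FormatKey => "pipe";
    public string[] SplitLine(string line) => line.Split('|');
}

// Hypothetical registry mirroring a register-then-resolve flow.
public static class ReaderRegistrySketch
{
    private static readonly Dictionary<string, ILineReader> Readers =
        new Dictionary<string, ILineReader>(StringComparer.OrdinalIgnoreCase);

    // A later registration for the same key replaces the earlier one.
    public static void Register(ILineReader reader)
    {
        if (reader == null) throw new ArgumentNullException(nameof(reader));
        Readers[reader.FormatKey] = reader;
    }

    // Fails loudly when nothing is registered for the key, which is the
    // situation an "open" call would run into if registration was skipped.
    public static ILineReader GetReader(string formatKey)
    {
        if (Readers.TryGetValue(formatKey, out ILineReader reader)) return reader;
        throw new InvalidOperationException(
            $"No reader registered for format '{formatKey}'.");
    }
}
```

The point of the sketch is that the data source only asks the registry for a reader by type, so adding a format means adding a registration and leaving the data source untouched.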
Creating a Reader Extension
- Create a class under FileManager/Readers implementing IFileFormatReader.
- Decorate the class with FileReaderAttribute so FileReaderRegistry.Discover() can build a FileReaderDescriptor.
  - Example:
[FileReader(DataSourceType.CSV, "CSV", "csv")]
public sealed class CsvFileReader : IFileFormatReader
{
    public DataSourceType SupportedType => DataSourceType.CSV;
    public string GetDefaultExtension() => "csv";
    // ... implement contract
}
- Set SupportedType and GetDefaultExtension() for the format. Keep them consistent with the attribute.
- Implement Configure(IConnectionProperties) for delimiter/encoding/flags.
- Implement schema (ReadHeaders, GetEntityStructure) and row streaming (ReadRows).
- Implement write operations (CreateFile, AppendRow, RewriteFile).
- Register the reader:
  - Static path: FileReaderFactory.Register(new YourReader())
  - Discovery path: ensure assembly is loaded, then run new FileReaderRegistry(editor).Discover().
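For the RewriteFile step, one common way to avoid leaving a half-written file behind is to write to a temporary file in the same directory and then swap it into place. The sketch below assumes plain text lines and UTF-8 output; AtomicRewriteSketch and RewriteLines are hypothetical names, and this is not taken from the existing readers. The swap is only as atomic as the underlying file system's replace/rename operation.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class AtomicRewriteSketch
{
    // Writes all lines to a temp file beside the target, then swaps it in.
    public static void RewriteLines(string filePath, IEnumerable<string> lines)
    {
        string fullPath = Path.GetFullPath(filePath);
        string directory = Path.GetDirectoryName(fullPath);
        // Same directory as the target so the swap stays on one volume.
        string tempPath = Path.Combine(directory, Path.GetRandomFileName() + ".tmp");
        try
        {
            using (var writer = new StreamWriter(tempPath, false, new UTF8Encoding(false)))
            {
                foreach (string line in lines) writer.WriteLine(line);
            }

            if (File.Exists(fullPath))
                File.Replace(tempPath, fullPath, null); // no backup copy kept
            else
                File.Move(tempPath, fullPath);
        }
        finally
        {
            // Clean up the temp file if the swap did not happen.
            if (File.Exists(tempPath)) File.Delete(tempPath);
        }
    }
}
```

If writing fails partway through, the original file is left as it was and only the temp file is discarded.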
Validation and Safety
- Honor ParseMode (Strict vs Lenient) and populate LastDiagnostics.
- Keep row streaming forward-only (IEnumerable<string[]>) for large files.
- Use atomic rewrite semantics for update/delete style operations.
- Normalize/retain source column names consistently with FileReaderEntityHelper.
- Avoid embedding format-specific logic in FileDataSource partials.
- If using registry discovery, do not omit [FileReader(...)]; without it, the reader will not produce a FileReaderDescriptor.
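A minimal sketch of how strict and lenient parsing can differ inside a forward-only row stream is shown here. SketchParseMode, SketchDiagnostic, and DelimitedStreamSketch are hypothetical stand-ins; the real ParseMode and RowDiagnostic types may have different members. The naive Split call ignores quoting and is used only to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical stand-ins; the real ParseMode and RowDiagnostic may differ.
public enum SketchParseMode { Strict, Lenient }

public sealed class SketchDiagnostic
{
    public SketchDiagnostic(int lineNumber, string message)
    {
        LineNumber = lineNumber;
        Message = message;
    }

    public int LineNumber { get; }
    public string Message { get; }
}

public sealed class DelimitedStreamSketch
{
    private readonly List<SketchDiagnostic> _diagnostics = new List<SketchDiagnostic>();

    public SketchParseMode Mode { get; set; } = SketchParseMode.Lenient;
    public IReadOnlyList<SketchDiagnostic> LastDiagnostics => _diagnostics;

    // Streams rows one at a time; only the current line is held in memory.
    // Because this is an iterator, diagnostics are cleared when enumeration
    // starts, not when the method is called.
    public IEnumerable<string[]> ReadRows(TextReader reader, int expectedColumns, char delimiter = ',')
    {
        _diagnostics.Clear();
        int lineNumber = 0;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            lineNumber++;
            if (line.Length == 0) continue; // skip blank lines

            string[] fields = line.Split(delimiter);
            if (fields.Length != expectedColumns)
            {
                string message = $"Expected {expectedColumns} columns but found {fields.Length}.";
                if (Mode == SketchParseMode.Strict)
                    throw new FormatException($"Line {lineNumber}: {message}");

                // Lenient: record the problem and move on to the next row.
                _diagnostics.Add(new SketchDiagnostic(lineNumber, message));
                continue;
            }

            yield return fields;
        }
    }
}
```

In lenient mode the caller gets every well-formed row plus a list of what was skipped; in strict mode the first malformed row stops enumeration with an exception.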
Pitfalls
- Forgetting to register the reader before Openconnection().
- Returning inconsistent column counts from ReadRows.
- Not handling empty files and header-less files deterministically.
- Throwing on every malformed row when ParseMode.Lenient is selected.
- Breaking Entities/EntitiesNames refresh after schema inference.
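Two of the pitfalls above, inconsistent column counts and non-deterministic handling of empty or header-less files, can be guarded against with small helpers like the following. This is a sketch under stated assumptions: RowShapeSketch is a hypothetical name, the Column1..ColumnN naming is invented for illustration, and padding/truncating is only one possible policy (truncation drops data, so a reader may prefer to report a diagnostic instead).

```csharp
using System;

public static class RowShapeSketch
{
    // Pads short rows with empty strings and truncates long rows so that
    // every returned row has exactly headers.Length entries.
    public static string[] FitToHeaders(string[] headers, string[] row)
    {
        var fitted = new string[headers.Length];
        for (int i = 0; i < fitted.Length; i++)
            fitted[i] = i < row.Length ? row[i] : string.Empty;
        return fitted;
    }

    // Empty input yields zero headers. Header-less input yields synthetic
    // names based on the width of the first row. Otherwise the first row
    // is copied and used as the header row.
    public static string[] ResolveHeaders(string[] firstRow, bool hasHeader)
    {
        if (firstRow == null || firstRow.Length == 0) return Array.Empty<string>();
        if (hasHeader) return (string[])firstRow.Clone();

        var names = new string[firstRow.Length];
        for (int i = 0; i < names.Length; i++) names[i] = "Column" + (i + 1);
        return names;
    }
}
```

Whatever policy is chosen, applying it in one place keeps the same file producing the same schema and row shape on every read.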
File Locations
- DataManagementEngineStandard/FileManager/FileDataSource.cs
- DataManagementEngineStandard/FileManager/FileDataSource.Connection.cs
- DataManagementEngineStandard/FileManager/FileDataSource.Schema.cs
- DataManagementEngineStandard/FileManager/FileReaderFactory.cs
- DataManagementEngineStandard/FileManager/Readers/IFileFormatReader.cs
- DataManagementEngineStandard/FileManager/Readers/CsvFileReader.cs
Example
// Skeleton only: method bodies are placeholders to be filled in.
public sealed class NdjsonFileReader : IFileFormatReader
{
    private readonly List<RowDiagnostic> _diagnostics = new();

    public DataSourceType SupportedType => DataSourceType.Json;
    public bool HasHeader { get; set; } = false;
    public ParseMode ParseMode { get; set; } = ParseMode.Lenient;
    public IReadOnlyList<RowDiagnostic> LastDiagnostics => _diagnostics;

    public string GetDefaultExtension() => "ndjson";
    public void Configure(IConnectionProperties props) { }
    public void ClearDiagnostics() => _diagnostics.Clear();

    // Schema: NDJSON has no header row, so field names come from inference.
    public string[] ReadHeaders(string filePath) => Array.Empty<string>();
    public EntityStructure GetEntityStructure(string filePath) { /* infer fields */ return null; }

    // Rows: forward-only stream, one JSON object per line.
    public IEnumerable<string[]> ReadRows(string filePath) { /* stream json lines */ yield break; }
    public string InferFieldType(string current, string rawValue) => TypeInferenceHelper.Widen(current, rawValue);

    // Writes: placeholders that report success without touching the file.
    public bool CreateFile(string filePath, IReadOnlyList<string> headers) => true;
    public bool AppendRow(string filePath, IReadOnlyList<string> headers, IReadOnlyList<string> values) => true;
    public bool RewriteFile(string filePath, IReadOnlyList<string> headers, IEnumerable<IReadOnlyList<string>> rows) => true;
}

// Startup/bootstrap
FileReaderFactory.Register(new NdjsonFileReader());
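The "stream json lines" placeholder in ReadRows could be filled in along the following lines. This is a sketch, not the reader's actual implementation: NdjsonRowsSketch is a hypothetical helper, it assumes System.Text.Json is available to the project, it takes the header list as a parameter instead of inferring it, and it lets JsonException propagate on a malformed line (strict-style behavior).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

public static class NdjsonRowsSketch
{
    // Yields one string[] per JSON object line, ordered by the given headers.
    // Missing or null properties become empty strings; blank lines and
    // non-object lines are skipped.
    public static IEnumerable<string[]> ReadRows(TextReader reader, IReadOnlyList<string> headers)
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (string.IsNullOrWhiteSpace(line)) continue;

            using (JsonDocument doc = JsonDocument.Parse(line))
            {
                JsonElement root = doc.RootElement;
                if (root.ValueKind != JsonValueKind.Object) continue;

                var row = new string[headers.Count];
                for (int i = 0; i < headers.Count; i++)
                {
                    bool hasValue = root.TryGetProperty(headers[i], out JsonElement value)
                        && value.ValueKind != JsonValueKind.Null;
                    row[i] = hasValue ? value.ToString() : string.Empty;
                }

                yield return row;
            }
        }
    }
}
```

Each row is built and yielded before the next line is read, so memory use stays flat regardless of file size.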
Related Skills
Detailed Reference
Use reference.md for a complete extension checklist and implementation template.