Organize files by type

Slug: files-by-type


#Rust approach

#Prerequisites

Before you begin, you need to install Rust:

  1. Visit https://rustup.rs/ to download the Rust installer.
  2. Run the installer to set up:
    • rustc (the compiler)
    • cargo (the build tool and package manager)

#Creating the Project

Open a command prompt or terminal (like PowerShell, CMD, or Git Bash) and run:

# Create a new Rust project
cargo new rust_file_organizer
cd rust_file_organizer

# Replace the default files with the project files:
# Replace the contents of Cargo.toml with the provided configuration (see below)
# Replace the contents of src/main.rs with the provided source code (see below)

#Building the Project

#Debug Build (For Development)

Use this build for development and testing.

cargo build

The executable will be created at target/debug/rust_file_organizer.exe.

#Release Build (For Distribution)

Use this build for a smaller, optimized executable suitable for distribution.

cargo build --release

The executable will be created at target/release/rust_file_organizer.exe.

#Running the Tool

#Getting Help

View the basic help message with available options:

.\target\debug\rust_file_organizer.exe -h

View the more detailed help message:

.\target\debug\rust_file_organizer.exe --help

(Use .\target\release\ instead of .\target\debug\ if running the release build).

#Command-Line Options

The Rust File Organizer requires two mandatory options:

  • --source <DIR> or -s <DIR>: The source directory containing files to organize.
  • --target <DIR> or -t <DIR>: The target directory where organized files will be placed into subfolders.

#Basic Options

| Option | Short | Description | Default |
|---|---|---|---|
| --source <DIR> | -s | Source directory to organize files from | (Required) |
| --target <DIR> | -t | Target directory to place organized files | (Required) |
| --config <FILE> | -c | Path to custom JSON configuration file | None (uses built-in defaults) |
| --organize-by <ORGANIZE_BY> |  | Organization method (category, extension) | category |
| --move-files |  | Move files instead of copying them | False (copy mode) |
| --include-hidden | -i | Include hidden files and directories | False |
| --follow-links | -l | Follow symbolic links during directory traversal | False |
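
The --include-hidden and --follow-links flags control the directory walk itself. To make that concrete, here is a minimal, hypothetical sketch of how a walkdir-based traversal can gate hidden entries and symbolic links; for brevity it uses only the Unix dot-file convention, whereas the actual tool also checks the Windows hidden attribute:

```rust
// Minimal sketch: hidden-file filtering and symlink following with walkdir.
// Simplified to the Unix dot-file convention; the real tool also checks the
// Windows FILE_ATTRIBUTE_HIDDEN flag.
use std::path::Path;
use walkdir::{DirEntry, WalkDir};

fn is_hidden(entry: &DirEntry) -> bool {
    entry
        .file_name()
        .to_str()
        .map(|name| name.starts_with('.'))
        .unwrap_or(false)
}

fn list_files(source: &Path, include_hidden: bool, follow_links: bool) {
    let walker = WalkDir::new(source).follow_links(follow_links).into_iter();
    // Depth 0 is the source directory itself, which should never be filtered out.
    for entry in walker
        .filter_entry(|e| e.depth() == 0 || include_hidden || !is_hidden(e))
        .filter_map(Result::ok)
        .filter(|e| e.file_type().is_file())
    {
        println!("would organize: {}", entry.path().display());
    }
}

fn main() {
    list_files(Path::new("test_source"), false, false);
}
```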

#Duplicate Handling Options

These options control how the tool handles files that already exist in the target location with the same name. They conflict with each other – only use one. If none are specified, the default behavior is to overwrite.

| Option | Short | Description | Default Behavior |
|---|---|---|---|
| --timestamp-duplicates |  | Add timestamp to duplicate files | Disabled |
| --skip-existing | -k | Skip existing files in target | Disabled |
| --overwrite |  | Explicitly overwrite existing files in target | Default action |
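
Conceptually, these three flags pick one branch of a per-file decision made just before the copy or move. Below is a minimal sketch of that decision, simplified from the real logic, which also appends a counter and retries if a timestamped name still collides:

```rust
// Minimal sketch of duplicate handling for a single planned target path.
// Simplified: the real tool also appends a counter and re-checks for collisions.
use chrono::Local;
use std::path::{Path, PathBuf};

enum OnDuplicate {
    Overwrite, // default when no flag is given, or --overwrite
    Skip,      // --skip-existing
    Timestamp, // --timestamp-duplicates
}

fn resolve_target(target: &Path, mode: OnDuplicate) -> Option<PathBuf> {
    if !target.exists() {
        return Some(target.to_path_buf());
    }
    match mode {
        OnDuplicate::Overwrite => Some(target.to_path_buf()),
        OnDuplicate::Skip => None, // caller skips this file entirely
        OnDuplicate::Timestamp => {
            let stem = target
                .file_stem()
                .map(|s| s.to_string_lossy().into_owned())
                .unwrap_or_default();
            let stamp = Local::now().format("%Y%m%d_%H%M%S");
            let new_name = match target.extension().and_then(|e| e.to_str()) {
                Some(ext) => format!("{stem}_{stamp}.{ext}"),
                None => format!("{stem}_{stamp}"),
            };
            Some(target.with_file_name(new_name))
        }
    }
}

fn main() {
    let planned = resolve_target(
        Path::new(r"C:\OrganizedFiles\documents\report.pdf"),
        OnDuplicate::Timestamp,
    );
    println!("{planned:?}");
}
```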

#Advanced Options

| Option | Short | Description | Default |
|---|---|---|---|
| --remove-empty-source-dirs |  | Remove empty directories in source after moving files (requires --move-files) | False |
| --log-file <FILE> |  | Write logs to specified file (in addition to console) | None (console only) |
| --log-level <LOG_LEVEL> |  | Set logging level (error, warn, info, debug, trace) | info |
| --help | -h | Print the concise help message and exit | N/A |
| --version | -V | Print version information and exit | N/A |
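
--remove-empty-source-dirs works bottom-up: the deepest directories are checked first, so a parent whose only contents were now-empty children can also be removed in the same pass. Here is a minimal sketch of that idea, using walkdir's contents_first traversal rather than the tool's own depth sort:

```rust
// Minimal sketch: remove empty directories bottom-up.
// contents_first(true) yields children before their parents, so a parent whose
// only contents were empty subdirectories becomes removable in the same pass.
use std::{fs, io, path::Path};
use walkdir::WalkDir;

fn remove_empty_dirs(root: &Path) -> io::Result<u32> {
    let mut removed = 0;
    for entry in WalkDir::new(root)
        .min_depth(1)
        .contents_first(true)
        .into_iter()
        .filter_map(Result::ok)
        .filter(|e| e.file_type().is_dir())
    {
        // A directory is empty when read_dir yields nothing.
        if fs::read_dir(entry.path())?.next().is_none() {
            fs::remove_dir(entry.path())?;
            removed += 1;
        }
    }
    Ok(removed)
}

fn main() -> io::Result<()> {
    let removed = remove_empty_dirs(Path::new("test_source"))?;
    println!("removed {removed} empty directories");
    Ok(())
}
```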

#Organization Methods

The --organize-by option accepts the following values:

  • category: Organizes files into subfolders named by category, based on file extension mapping (default).
  • extension: Organizes files into subfolders named after the file extension itself (e.g., jpg, txt).
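
To make the two methods concrete, here is a minimal sketch of how one filename could map to a target subfolder under each of them; the two-entry category map stands in for the full built-in configuration:

```rust
// Minimal sketch: map a single file to a target subfolder name.
// The tiny category map below is a stand-in for the full built-in configuration.
use std::collections::HashMap;
use std::path::Path;

fn subfolder_for(path: &Path, by_category: bool, categories: &HashMap<&str, Vec<&str>>) -> String {
    // Normalize the extension to lowercase with a leading dot, e.g. ".jpg".
    let ext = match path.extension().and_then(|e| e.to_str()) {
        Some(e) => format!(".{}", e.to_lowercase()),
        None => return "no_extension".to_string(),
    };
    if by_category {
        categories
            .iter()
            .find(|(_, exts)| exts.contains(&ext.as_str()))
            .map(|(name, _)| name.to_string())
            .unwrap_or_else(|| "other".to_string())
    } else {
        // Extension mode: the folder is named after the extension itself.
        ext.trim_start_matches('.').to_string()
    }
}

fn main() {
    let mut categories = HashMap::new();
    categories.insert("images", vec![".jpg", ".png"]);
    categories.insert("documents", vec![".pdf", ".txt"]);

    let file = Path::new("holiday_photo.JPG");
    println!("category mode  -> {}", subfolder_for(file, true, &categories)); // images
    println!("extension mode -> {}", subfolder_for(file, false, &categories)); // jpg
}
```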

#Usage Examples

(Remember to replace YourUser, C:\Path\To\Source, etc., with your actual paths. Examples use the debug build path.)

#Basic File Organization (Copy by Category)

.\target\debug\rust_file_organizer.exe --source C:\Users\YourUser\Downloads --target C:\OrganizedFiles

#Moving Files & Timestamping Duplicates

.\target\debug\rust_file_organizer.exe --source C:\Users\YourUser\Downloads --target C:\OrganizedFiles --move-files --timestamp-duplicates

#Organizing by Extension & Including Hidden Files

.\target\debug\rust_file_organizer.exe -s "C:\Path\To\Source" -t "C:\Path\To\Target" --organize-by extension --include-hidden

#Skipping Existing Files During Copy

.\target\debug\rust_file_organizer.exe -s "C:\Path\To\Source" -t "C:\Path\To\Target" --skip-existing

#Moving Files & Cleaning Up Empty Source Directories

.\target\debug\rust_file_organizer.exe -s "C:\Path\To\Source" -t "C:\Path\To\Target" --move-files --remove-empty-source-dirs

#Using a Custom Config, Overwriting, Logging to File

.\target\debug\rust_file_organizer.exe -s .\input -t .\output -c .\my_config.json --overwrite --log-file activity.log --log-level debug

#Option Conflicts and Dependencies

  • Duplicate Handling: --timestamp-duplicates, --skip-existing, and --overwrite conflict. Use at most one. Default is overwrite if none are specified.
  • Empty Directory Removal: --remove-empty-source-dirs requires --move-files to be active.
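
These rules are enforced by clap at argument-parsing time, so an invalid combination is rejected before any file is touched. Here is a minimal sketch of how such constraints are declared with clap's derive API (a stripped-down argument struct, not the tool's full CLI):

```rust
// Minimal sketch: mutual exclusion and dependencies with clap's derive API.
// Stripped down to the flags discussed above; not the tool's full CLI struct.
use clap::Parser;

#[derive(Parser, Debug)]
#[command(version, about = "Duplicate-handling and cleanup flags demo")]
struct Args {
    /// Move files instead of copying them.
    #[arg(long)]
    move_files: bool,

    /// Add a timestamp to duplicate files.
    #[arg(long, conflicts_with_all = ["skip_existing", "overwrite"])]
    timestamp_duplicates: bool,

    /// Skip files that already exist in the target.
    #[arg(long, short = 'k', conflicts_with_all = ["timestamp_duplicates", "overwrite"])]
    skip_existing: bool,

    /// Explicitly overwrite existing files (also the default behavior).
    #[arg(long, conflicts_with_all = ["timestamp_duplicates", "skip_existing"])]
    overwrite: bool,

    /// Remove empty source directories; only meaningful when moving.
    #[arg(long, requires = "move_files")]
    remove_empty_source_dirs: bool,
}

fn main() {
    // Passing e.g. --skip-existing --overwrite makes clap exit with a usage error.
    let args = Args::parse();
    println!("{args:?}");
}
```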

#Testing Your Installation

After building, verify it works:

# Create test directories (using PowerShell/CMD syntax)
mkdir test_source test_target

# Copy some test files (e.g., image.jpg, document.pdf) into test_source

# Run the organizer
.\target\debug\rust_file_organizer.exe -s .\test_source -t .\test_target

# Check the test_target directory for organized subfolders (e.g., images, documents)
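
If you prefer an automated check, the same smoke test can be written as a Rust integration test. Below is a minimal sketch using the assert_cmd, assert_fs, and predicates crates; these are only suggested in the source's testing notes, so you would need to add them yourself as dev-dependencies:

```rust
// tests/smoke.rs -- minimal sketch of an automated smoke test.
// Assumes assert_cmd, assert_fs, and predicates are added as dev-dependencies.
use assert_cmd::Command;
use assert_fs::prelude::*;
use predicates::prelude::*;

#[test]
fn copies_a_text_file_into_the_documents_folder() -> Result<(), Box<dyn std::error::Error>> {
    let source = assert_fs::TempDir::new()?;
    let target = assert_fs::TempDir::new()?;
    source.child("notes.txt").write_str("hello")?;

    Command::cargo_bin("rust_file_organizer")?
        .arg("-s")
        .arg(source.path())
        .arg("-t")
        .arg(target.path())
        .assert()
        .success();

    // With the default category configuration, .txt files land under "documents".
    target.child("documents/notes.txt").assert(predicate::path::exists());
    Ok(())
}
```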

#Configuration File Format (--config)

Provide a JSON file mapping category names (strings) to lists of file extensions (strings, including the leading dot).

Example my_config.json:

{ "images": [".jpg", ".jpeg", ".png", ".gif", ".bmp", ".webp"], "documents": [".pdf", ".docx", ".doc", ".txt", ".rtf", ".odt", ".xlsx", ".xls", ".csv", ".pptx", ".ppt", ".md"], "archives": [".zip", ".rar", ".tar", ".gz", ".7z"], "code": [".rs", ".py", ".js", ".html", ".css", ".java"], "other": [] }
  • An "other" category is recommended to catch files whose extensions don’t match other categories. If organizing by category and a file doesn’t match any defined category (and no "other" category exists), it might be skipped or handled less predictably depending on implementation details. The built-in default configuration includes an “other” category.
  • If a file has no extension, it will typically be placed in a no_extension folder when organizing by extension, or potentially skipped/placed in “other” when organizing by category.
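
Internally the configuration is just a map from category name to a list of extensions. Here is a minimal sketch of loading such a file with serde_json (reusing the my_config.json example above):

```rust
// Minimal sketch: load a category configuration such as my_config.json.
// The file is a JSON object mapping category names to lists of extensions.
use std::collections::HashMap;
use std::fs;

fn main() -> anyhow::Result<()> {
    let raw = fs::read_to_string("my_config.json")?;
    let mut categories: HashMap<String, Vec<String>> = serde_json::from_str(&raw)?;

    // Guarantee a catch-all bucket even if the file omits it, mirroring the
    // behavior of the built-in defaults described above.
    categories.entry("other".to_string()).or_insert_with(Vec::new);

    for (name, extensions) in &categories {
        println!("{name}: {extensions:?}");
    }
    Ok(())
}
```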

#Troubleshooting

#Common Issues

  1. Permission Errors: Ensure the application has read permissions for the entire source directory tree and write/create permissions for the target directory. Run as administrator if necessary, but be cautious.
  2. Path Not Found: Double-check that the source and target paths are spelled correctly and accessible from where you are running the command. Use absolute paths if relative paths are causing issues.
  3. Duplicate Handling: If files aren’t being handled as expected (e.g., overwritten when you wanted skipping), verify which duplicate handling flag (--skip-existing, --timestamp-duplicates, --overwrite) is active or if the default overwrite behavior is occurring.
  4. Cross-Device Move Errors: If moving files (--move-files) between different drives or partitions fails, the tool attempts a copy-then-delete fallback (see the sketch after this list). Ensure there’s enough space on the target drive for the copy.
  5. Configuration File Errors: If using --config, ensure the JSON file is correctly formatted and the path is correct. Check logs for parsing errors.
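
For the cross-device case in issue 4, the fallback is conceptually a rename that degrades to copy-then-delete. Below is a minimal sketch of that pattern, simplified so that any rename failure triggers the fallback, whereas the real tool first inspects the OS error code:

```rust
// Minimal sketch: move a file with a copy-then-delete fallback.
// Simplified: falls back on any rename error; the real tool only does so for
// cross-device errors (EXDEV on Unix, ERROR_NOT_SAME_DEVICE on Windows).
use std::{fs, io, path::Path};

fn move_file(source: &Path, target: &Path) -> io::Result<()> {
    match fs::rename(source, target) {
        Ok(()) => Ok(()),
        Err(_) => {
            // The copy needs enough free space on the target drive and, unlike
            // a rename, does not preserve timestamps.
            fs::copy(source, target)?;
            fs::remove_file(source)
        }
    }
}

fn main() -> io::Result<()> {
    move_file(
        Path::new("test_source/notes.txt"),
        Path::new("test_target/documents/notes.txt"),
    )
}
```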

#Logging for Diagnostics

Use the logging options to get more detailed information:

  • --log-level debug or --log-level trace: Provides much more detailed output about scanning, decisions, and operations.
  • --log-file <path/to/logfile.log>: Writes all logs to a file, making it easier to review extensive output.

Example for detailed logging to a file:

.\target\debug\rust_file_organizer.exe -s <source> -t <target> --log-level trace --log-file organizer.log

Review the console output and the specified log file (organizer.log in the example) for specific error messages or steps where the process might be failing.
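
Under the hood, logging is wired through the log facade and fern, which is what makes the console/file/level combinations above possible. Here is a minimal sketch of such a dispatcher, much simpler than the tool's own setup:

```rust
// Minimal sketch: log to stderr and optionally to a file at a chosen level.
// The real tool adds colored levels, per-module filters, and richer formatting.
use log::{info, LevelFilter};

fn setup_logging(level: LevelFilter, log_file: Option<&str>) -> Result<(), fern::InitError> {
    let mut dispatch = fern::Dispatch::new()
        .format(|out, message, record| {
            out.finish(format_args!(
                "[{} {}] {}",
                chrono::Local::now().format("%Y-%m-%d %H:%M:%S"),
                record.level(),
                message
            ))
        })
        .level(level)
        .chain(std::io::stderr());

    if let Some(path) = log_file {
        dispatch = dispatch.chain(fern::log_file(path)?);
    }
    dispatch.apply()?;
    Ok(())
}

fn main() -> Result<(), fern::InitError> {
    setup_logging(LevelFilter::Debug, Some("organizer.log"))?;
    info!("logging initialized");
    Ok(())
}
```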

#Cargo.toml

```toml
# Cargo.toml (Fixed fern feature)

[package]
name = "rust_file_organizer"
version = "0.1.0"
edition = "2021"
description = "A Rust utility for organizing files by category or extension."
authors = ["AI Assistant"] # Replace with actual author
license = "MIT OR Apache-2.0" # Choose appropriate license

[dependencies]
# Command-line argument parsing
clap = { version = "4.5", features = ["derive", "cargo", "env"] } # cargo/env features allow reading from Cargo.toml/env vars
# Directory traversal
walkdir = "2.5"
# JSON parsing for configuration files
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
# Logging framework
log = "0.4"
# Using fern for flexible file/console logging setup
# Fix: enable the 'colored' feature for fern
fern = { version = "0.6", features = ["colored"] }
# Date and time handling for timestamps
chrono = "0.4"
# Flexible error handling and context reporting
anyhow = "1.0"
# Optional: more robust file operations (especially cross-device move)
# fs_extra = "1.3" # Keep commented unless needed; std lib fallback implemented

# Windows-specific dependencies are NOT needed here as std lib is used for hidden check
# [target.'cfg(windows)'.dependencies]
# windows = { version = "0.56", features = [...] }

# Release profile optimizations (optional but recommended for smaller/faster executables)
[profile.release]
# 'z' optimizes aggressively for size, potentially sacrificing some speed compared to 's' or '3'.
opt-level = 'z'
# Enable Link-Time Optimization across crates for potential performance gains and size reduction.
lto = true
# Maximize optimization opportunities by using a single codegen unit (can significantly increase compile times).
codegen-units = 1
# Remove symbols from the binary for smaller size.
# Alternatively, use `cargo strip` (requires `cargo install cargo-strip`) or set `debuginfo = 0` (removes debug symbols only).
strip = true
# Abort on panic instead of unwinding the stack. Reduces binary size but prevents catching panics.
panic = 'abort'
```

#main.rs

// src/main.rs: Rust implementation of the file organizer utility. // This version fixes compilation errors and ensures completeness. use anyhow::{bail, Context, Result}; // Use anyhow for convenient error handling use clap::Parser; // For command-line argument parsing use chrono::Local; // For generating timestamps use fern::colors::{Color, ColoredLevelConfig}; // For colored logging output use log::{debug, error, info, warn, LevelFilter}; // Logging facade use serde::Deserialize; // For deserializing JSON config use std::{ collections::{HashMap, HashSet}, env, // For CARGO_PKG_VERSION // Removed unused 'File' import fs::{self}, // Standard file system operations io::{self, ErrorKind}, path::{Path, PathBuf}, time::Instant, // For accurate duration measurement }; use walkdir::{DirEntry, WalkDir}; // For efficient directory traversal // --- Configuration Structures --- /// Represents the structure of the JSON configuration file for categories. #[derive(Deserialize, Debug, Clone)] struct CategoriesConfig(HashMap<String, Vec<String>>); /// Provides the default file categorization configuration. fn default_categories() -> CategoriesConfig { let mut map = HashMap::new(); macro_rules! add_category { ($map:expr, $name:expr, [$($ext:expr),* $(,)?]) => { $map.insert($name.to_string(), vec![$($ext.to_string()),*]); }; } add_category!(map, "images", [".jpg", ".jpeg", ".png", ".gif", ".bmp", ".webp", ".tiff", ".svg", ".ico"]); add_category!(map, "documents", [".pdf", ".docx", ".doc", ".txt", ".rtf", ".odt", ".xlsx", ".xls", ".csv", ".pptx", ".ppt", ".md", ".tex", ".chm", ".epub"]); add_category!(map, "videos", [".mp4", ".avi", ".mkv", ".mov", ".wmv", ".flv", ".webm"]); add_category!(map, "audio", [".mp3", ".wav", ".flac", ".aac", ".ogg", ".m4a"]); add_category!(map, "archives", [".zip", ".rar", ".tar", ".gz", ".bz2", ".7z", ".iso"]); add_category!(map, "code", [".py", ".java", ".c", ".cpp", ".h", ".cs", ".html", ".css", ".js", ".ts", ".jsx", ".tsx", ".xml", ".json", ".yaml", ".yml", ".sh", ".bat", ".ps1", ".rb", ".php", ".go", ".rs", ".swift", ".kt", ".ipynb", ".sql", ".toml"]); add_category!(map, "apps", [".exe", ".msi", ".apk", ".dmg", ".deb", ".rpm", ".app"]); add_category!(map, "fonts", [".ttf", ".otf", ".woff", ".woff2"]); add_category!(map, "shortcuts", [".lnk", ".url"]); map.insert("other".to_string(), vec![]); CategoriesConfig(map) } /// Loads category configuration from a specified JSON file path. fn load_config_file(config_path: Option<&PathBuf>) -> Result<CategoriesConfig> { match config_path { Some(path) if path.is_file() => { info!("Loading category configuration from: {}", path.display()); let file_content = fs::read_to_string(path) .with_context(|| format!("Failed to read config file: {}", path.display()))?; let mut config: HashMap<String, Vec<String>> = serde_json::from_str(&file_content) .with_context(|| format!("Failed to parse JSON config file: {}", path.display()))?; config.entry("other".to_string()).or_insert_with(Vec::new); info!("Successfully loaded and validated custom configuration."); Ok(CategoriesConfig(config)) } Some(path) => { warn!( "Config path '{}' provided but is not a valid file. Using default categories.", path.display() ); Ok(default_categories()) } None => { info!("No config file specified. Using default category configuration."); Ok(default_categories()) } } } /// Determines the category name (String) for a given file based on its extension. 
fn categorize_file(filename: &Path, categories_config: &CategoriesConfig) -> String { let extension = filename .extension() .and_then(|s| s.to_str()) .map(|s| format!(".{}", s.to_lowercase())) .unwrap_or_else(|| "no_extension".to_string()); if extension == "no_extension" { return extension; } for (category, extensions) in &categories_config.0 { if extensions.contains(&extension) { return category.clone(); } } "other".to_string() } // --- Platform Specific --- /// Checks if a file or directory is hidden on Windows using standard library features. #[cfg(windows)] fn is_hidden(path: &Path) -> Result<bool> { use std::os::windows::fs::MetadataExt; let metadata = fs::metadata(path) .with_context(|| format!("Failed to get metadata for {}", path.display()))?; let attributes = metadata.file_attributes(); // Use the raw value 0x2 for FILE_ATTRIBUTE_HIDDEN when using std::os::windows::fs const FILE_ATTRIBUTE_HIDDEN_VALUE: u32 = 0x2; Ok((attributes & FILE_ATTRIBUTE_HIDDEN_VALUE) != 0) } /// Checks if a file or directory is hidden on Unix-like systems (conventionally, starts with '.'). #[cfg(not(windows))] fn is_hidden(path: &Path) -> Result<bool> { Ok(path .file_name() .and_then(|s| s.to_str()) .map(|s| s.starts_with('.')) .unwrap_or(false)) } /// Helper for walkdir filter_entry to check hidden status. fn should_keep_entry(entry: &DirEntry, include_hidden: bool) -> bool { if include_hidden { return true; } match is_hidden(entry.path()) { Ok(hidden) => !hidden, Err(err) => { warn!( "Could not determine hidden status for {}: {}. Excluding entry.", entry.path().display(), err ); false // Exclude if check fails (safer default) } } } // --- Command Line Arguments --- #[derive(clap::ValueEnum, Clone, Debug, PartialEq, Eq)] enum OrganizeMethod { Category, Extension, } #[derive(Parser, Debug)] #[command(author, version, about = "Organize files by category or extension (Rust version).", long_about = "A Rust utility for organizing files from source directories into categorized target folders based on file types or extensions.", help_template = "{before-help}{name} {version}\n{author-with-newline}{about-with-newline}\n{usage-heading} {usage}\n\n{all-args}{after-help}" )] struct CliArgs { #[arg(short, long, value_name = "DIR")] source: PathBuf, #[arg(short, long, value_name = "DIR")] target: PathBuf, #[arg(short, long, value_name = "FILE")] config: Option<PathBuf>, #[arg(long, value_enum, default_value_t = OrganizeMethod::Category)] organize_by: OrganizeMethod, #[arg(long)] move_files: bool, #[arg(long, conflicts_with_all = ["skip_existing", "overwrite"])] timestamp_duplicates: bool, #[arg(long, short = 'k', conflicts_with_all = ["timestamp_duplicates", "overwrite"])] skip_existing: bool, #[arg(long, conflicts_with_all = ["timestamp_duplicates", "skip_existing"])] overwrite: bool, #[arg(long, short = 'i')] include_hidden: bool, #[arg(long, short = 'l')] follow_links: bool, #[arg(long, requires = "move_files")] remove_empty_source_dirs: bool, #[arg(long, value_name = "FILE")] log_file: Option<PathBuf>, #[arg(long, value_parser = clap::value_parser!(LevelFilter), default_value = "info")] log_level: LevelFilter, } // --- Core Logic --- /// Holds statistics about the file processing operation. #[derive(Debug, Default)] struct ProcessStats { /// Count of directory entries successfully yielded by the filtered WalkDir iterator. total_scanned: u64, processed: u64, skipped: u64, errors: u64, failed_files: Vec<(PathBuf, String)>, // Stores paths and error context } /// Scans the source directory for unique file extensions. 
fn get_all_extensions( source_directory: &Path, include_hidden: bool, follow_links: bool, ) -> Result<HashSet<String>> { let mut extensions = HashSet::new(); info!("Scanning source directory for all unique file extensions..."); let walker = WalkDir::new(source_directory) .follow_links(follow_links) .into_iter(); let mut count = 0; for entry_result in walker.filter_entry(|e| should_keep_entry(e, include_hidden)) { match entry_result { Ok(entry) => { let path = entry.path(); if path.is_file() { count += 1; if let Some(ext) = path.extension().and_then(|e| e.to_str()) { extensions.insert(ext.to_lowercase()); } if count % 1000 == 0 { debug!("Scanned {} files for extensions...", count); } } } Err(e) => warn!("Error accessing entry during extension scan: {}", e), } } info!("Found {} unique extensions.", extensions.len()); Ok(extensions) } /// Creates necessary target subfolders. fn create_target_folders( base_dir: &Path, organize_by: &OrganizeMethod, categories_config: Option<&CategoriesConfig>, all_extensions: Option<&HashSet<String>>, ) -> Result<()> { info!("Ensuring target base directory exists: {}", base_dir.display()); fs::create_dir_all(base_dir) .with_context(|| format!("Failed to create base target directory: {}", base_dir.display()))?; let folders_to_create: HashSet<String> = match organize_by { OrganizeMethod::Category => { let mut folders = categories_config .map(|cfg| cfg.0.keys().cloned().collect::<HashSet<String>>()) .unwrap_or_default(); folders.insert("other".to_string()); folders.insert("no_extension".to_string()); folders } OrganizeMethod::Extension => { let mut folders = all_extensions.cloned().unwrap_or_default(); folders.insert("no_extension".to_string()); folders } }; info!( "Creating target subfolders (mode: {:?}). Total potential folders: {}", organize_by, folders_to_create.len() ); let mut created_count = 0; for folder_name in &folders_to_create { if folder_name.is_empty() { warn!("Skipping creation of folder with empty name."); continue; } let folder_path = base_dir.join(folder_name); if !folder_path.exists() { fs::create_dir_all(&folder_path).with_context(|| { format!("Failed to create target folder '{}'", folder_path.display()) })?; debug!("Created target folder: {}", folder_path.display()); created_count += 1; } } info!("Created {} new target subfolders.", created_count); Ok(()) } /// Attempts to move a file, falling back to copy-then-delete on cross-device errors. fn move_file_with_fallback(source: &Path, target: &Path) -> io::Result<()> { match fs::rename(source, target) { Ok(_) => Ok(()), Err(rename_error) => { // Note: Relies on platform-specific OS error codes (Windows: 17, Unix: 18/libc::EXDEV). Might be brittle. let is_cross_device = || -> bool { #[cfg(windows)] { rename_error.raw_os_error() == Some(17) } #[cfg(unix)] { rename_error.raw_os_error() == Some(18) } #[cfg(not(any(windows, unix)))] { false } }; if is_cross_device() { warn!("Rename failed (cross-device error detected), attempting copy+delete fallback for move: {} -> {}", source.display(), target.display()); // Note: std::fs::copy preserves permissions but not other metadata like modification time (unlike Python's shutil.copy2). // For full metadata preservation, consider crates like `fs_extra` or platform-specific APIs. fs::copy(source, target)?; fs::remove_file(source)?; Ok(()) } else { Err(rename_error) } } } } /// Main function to perform the file organization. 
fn organize_files(args: &CliArgs, categories_config: &CategoriesConfig) -> Result<ProcessStats> { if !args.source.is_dir() { bail!("Source directory '{}' is invalid or not found.", args.source.display()); } if !args.target.exists() { info!("Creating target directory: {}", args.target.display()); fs::create_dir_all(&args.target).with_context(|| format!("Could not create target directory '{}'", args.target.display()))?; } else if !args.target.is_dir() { bail!("Target path '{}' exists but is not a directory.", args.target.display()); } info!("Preparing target folders..."); let extensions_for_folders = if args.organize_by == OrganizeMethod::Extension { Some(get_all_extensions(&args.source, args.include_hidden, args.follow_links)?) } else { None }; create_target_folders(&args.target, &args.organize_by, Some(categories_config), extensions_for_folders.as_ref())?; info!("Starting file processing..."); let mut stats = ProcessStats::default(); let mut file_counter = 0u64; let walker = WalkDir::new(&args.source).follow_links(args.follow_links).into_iter(); for entry_result in walker.filter_entry(|e| should_keep_entry(e, args.include_hidden)) { stats.total_scanned += 1; // Count filtered entries let entry = match entry_result { Ok(e) => e, Err(e) => { let path_display = e.path().unwrap_or_else(|| Path::new("?")).display(); error!("Error scanning path {}: {}", path_display, e); stats.errors += 1; // Provide default PathBuf if e.path() is None stats.failed_files.push(( e.path().map_or_else(|| PathBuf::from("?"), |p| p.to_path_buf()), format!("Scan error: {:?}", e), )); continue; } }; if !entry.file_type().is_file() { stats.skipped += 1; continue; } file_counter += 1; let source_path = entry.path(); let progress_prefix = format!("[{}]", file_counter); let file_result: Result<()> = (|| { let file_name = source_path.file_name().with_context(|| format!("Could not get filename for path: {}", source_path.display()))?; let target_subfolder_name = match args.organize_by { OrganizeMethod::Category => categorize_file(source_path, categories_config), OrganizeMethod::Extension => source_path.extension().and_then(|s| s.to_str()).map(|s| s.to_lowercase()).unwrap_or_else(|| "no_extension".to_string()), }; let target_folder_path = args.target.join(&target_subfolder_name); let mut target_file_path = target_folder_path.join(file_name); if target_file_path.exists() { if args.skip_existing { info!("{} Skipping (target exists): {}", progress_prefix, target_file_path.display()); stats.skipped += 1; return Ok(()); } else if args.timestamp_duplicates { let original_target_path_display = target_file_path.display().to_string(); let mut counter = 1; // Use file_name directly, it's already &OsStr let stem = source_path.file_stem().unwrap_or(file_name); let ext = source_path.extension().unwrap_or_default(); const MAX_TIMESTAMP_ATTEMPTS: u32 = 1000; loop { let timestamp = Local::now().format("%Y%m%d_%H%M%S"); let mut new_name_os = std::ffi::OsString::new(); new_name_os.push(stem); new_name_os.push(format!("_{}_{}", timestamp, counter)); if !ext.is_empty() { new_name_os.push("."); new_name_os.push(ext); } target_file_path = target_folder_path.join(&new_name_os); if !target_file_path.exists() { break; } counter += 1; if counter > MAX_TIMESTAMP_ATTEMPTS { bail!("Could not find unique timestamped name for {} after {} attempts. Skipping.", original_target_path_display, MAX_TIMESTAMP_ATTEMPTS); } } info!("{} Target exists '{}'. 
Renaming duplicate to: {}", progress_prefix, original_target_path_display, target_file_path.display()); } else if args.overwrite { warn!("{} Overwriting existing target file: {}", progress_prefix, target_file_path.display()); } else { warn!("{} Overwriting existing target file (default): {}", progress_prefix, target_file_path.display()); } } fs::create_dir_all(&target_folder_path).with_context(|| format!("Failed to ensure target directory '{}' exists", target_folder_path.display()))?; let operation_desc = if args.move_files { "move" } else { "copy" }; debug!("{} Attempting to {} '{}' to '{}'", progress_prefix, operation_desc, source_path.display(), target_file_path.display()); if args.move_files { move_file_with_fallback(source_path, &target_file_path).with_context(|| format!("Failed to move '{}' to '{}'", source_path.display(), target_file_path.display()))?; } else { // Note: std::fs::copy preserves permissions but not other metadata like modification time (unlike Python's shutil.copy2). // For full metadata preservation, consider crates like `fs_extra` or platform-specific APIs. fs::copy(source_path, &target_file_path).map(|_| ()).with_context(|| format!("Failed to copy '{}' to '{}'", source_path.display(), target_file_path.display()))?; } stats.processed += 1; Ok(()) })(); // End inner closure if let Err(e) = file_result { error!("{} Failed to process '{}': {:?}", progress_prefix, source_path.display(), e); stats.errors += 1; stats.failed_files.push((source_path.to_path_buf(), format!("{:?}", e))); } } // End main loop Ok(stats) } /// Recursively removes empty folders starting from the bottom up. fn remove_empty_folders(directory: &Path) -> Result<u32> { let mut removed_count = 0u32; info!("Attempting to remove empty directories within: {}", directory.display()); let mut dirs_to_check = Vec::new(); for entry_result in WalkDir::new(directory).min_depth(1) { match entry_result { Ok(entry) if entry.file_type().is_dir() => { dirs_to_check.push(entry.into_path()); } Ok(_) => {} Err(e) => warn!("Error accessing entry during empty dir scan: {}", e), } } dirs_to_check.sort_by(|a, b| b.components().count().cmp(&a.components().count())); for dir_path in dirs_to_check { if !dir_path.is_dir() { continue; } match fs::read_dir(&dir_path) { Ok(mut read_dir) => { if read_dir.next().is_none() { // Directory is empty match fs::remove_dir(&dir_path) { Ok(_) => { info!("Removed empty directory: {}", dir_path.display()); removed_count += 1; } Err(e) => { if e.kind() != ErrorKind::NotFound { warn!("Could not remove presumably empty directory '{}': {}", dir_path.display(), e); } } } } } Err(e) => { if e.kind() != ErrorKind::NotFound { warn!("Could not read directory '{}' to check emptiness: {}", dir_path.display(), e); } } } } info!("Finished removing empty directories. Removed: {}", removed_count); Ok(removed_count) } // --- Logging Setup --- /// Sets up logging using the fern crate. 
fn setup_logging(log_level: LevelFilter, log_file: Option<&PathBuf>) -> Result<()> { let colors = ColoredLevelConfig::new().error(Color::Red).warn(Color::Yellow).info(Color::Green).debug(Color::Blue).trace(Color::BrightBlack); let base_config = fern::Dispatch::new() .format(move |out, message, record| { out.finish(format_args!( "[{} {} {}] {}", chrono::Local::now().format("%Y-%m-%d %H:%M:%S%.3f"), colors.color(record.level()), record.target(), message )) }) .level(log_level) .level_for("hyper", LevelFilter::Warn).level_for("mio", LevelFilter::Warn).level_for("want", LevelFilter::Warn).level_for("reqwest", LevelFilter::Warn).level_for("rustls", LevelFilter::Warn); let stderr_logger = fern::Dispatch::new().filter(move |metadata| { log_level <= LevelFilter::Debug || metadata.target().starts_with(env!("CARGO_PKG_NAME")) }).chain(std::io::stderr()); let mut final_dispatch = base_config.chain(stderr_logger); let mut file_logger_ok = false; if let Some(log_path) = log_file { match fern::log_file(log_path) { Ok(file_output) => { final_dispatch = final_dispatch.chain(file_output); file_logger_ok = true; } Err(e) => { eprintln!("Error: Failed to create log file '{}': {}. Logging to console only.", log_path.display(), e); } } } final_dispatch.apply().context("Failed to set up logging")?; if file_logger_ok { if let Some(path) = log_file { info!("File logging enabled to: {}", path.display()); } } Ok(()) } // --- Main Application Entry Point --- fn main() -> Result<()> { let overall_start_time = Instant::now(); // Start timing let args = CliArgs::parse(); setup_logging(args.log_level, args.log_file.as_ref())?; info!("Rust File Organizer (v{}) starting...", env!("CARGO_PKG_VERSION")); debug!("Arguments received: {:?}", args); info!("Source directory: {}", args.source.display()); info!("Target directory: {}", args.target.display()); info!("Organization mode: {:?}", args.organize_by); info!("Operation: {}", if args.move_files { "Move" } else { "Copy" }); info!("Include hidden: {}", args.include_hidden); info!("Follow links: {}", args.follow_links); if args.skip_existing { info!("Duplicate handling: Skip existing"); } else if args.timestamp_duplicates { info!("Duplicate handling: Timestamp duplicates"); } else if args.overwrite { info!("Duplicate handling: Overwrite existing (explicitly)"); } else { info!("Duplicate handling: Overwrite existing (default)"); } let categories = load_config_file(args.config.as_ref())?; let result = organize_files(&args, &categories); match result { Ok(stats) => { if args.move_files && args.remove_empty_source_dirs && stats.processed > 0 { // Note: Cleanup errors are logged but do not cause a non-zero exit code. if let Err(e) = remove_empty_folders(&args.source) { error!("Error during empty source directory removal: {:?}", e); } } let overall_duration = overall_start_time.elapsed(); let summary = format!( "Operation completed in {:.2?}. Scanned Entries: {}, Processed Files: {}, Skipped: {}, Errors: {}", overall_duration, stats.total_scanned, stats.processed, stats.skipped, stats.errors ); info!("{}", summary); println!("\n{}", summary); if stats.errors > 0 { eprintln!("\n--- Errors occurred during processing: ---"); for (path, error_msg) in &stats.failed_files { eprintln!(" - File: {}", path.display()); eprintln!(" Error: {}", error_msg); } eprintln!("-----------------------------------------"); eprintln!("Warning: {} errors occurred. 
Please check logs (stderr/file) for full details.", stats.errors); // Consider exiting with non-zero status for scripting if errors occurred // std::process::exit(1); } } Err(e) => { error!("Critical error during file organization: {:?}", e); eprintln!("\nError: File organization failed critically. Check logs (stderr/file) for details."); std::process::exit(1); } } info!("Rust File Organizer finished."); Ok(()) } // --- Testing Notes --- // To properly test this application, consider using crates like: // - `assert_fs`: For creating temporary file/directory structures for tests. // - `predicates`: For making assertions about file system state (e.g., file exists, content matches). // - `assert_cmd`: For testing the command-line interface behavior, arguments, exit codes, and output. // // Example Test Scenarios (Conceptual): // - Test basic copy/move by category and extension. // - Test duplicate handling flags (skip, timestamp, overwrite) work correctly. // - Test hidden file handling with and without the --include-hidden flag. // - Test symbolic link handling with and without the --follow-links flag. // - Test behavior with empty source or target directories. // - Test custom configuration loading and verify correct categorization (including missing 'other'). // - Test empty directory removal after a successful move operation. // - Test error handling for scenarios like insufficient permissions (harder to automate reliably). // - Test long path handling specifically on Windows (requires careful test setup). // - Test cross-device move fallback behavior. // - Test handling of filenames with non-UTF8 characters (requires OsStr handling).

#Python approach

#!/usr/bin/env python3 import os import shutil import argparse import json import platform import traceback import logging from datetime import datetime # --- Default Configuration --- DEFAULT_CATEGORIES_CONFIG = { "images": [".jpg", ".jpeg", ".png", ".gif", ".bmp", ".webp", ".tiff", ".svg", ".ico"], "documents": [".pdf", ".docx", ".doc", ".txt", ".rtf", ".odt", ".xlsx", ".xls", ".csv", ".pptx", ".ppt", ".md", ".tex", ".chm", ".epub"], "videos": [".mp4", ".avi", ".mkv", ".mov", ".wmv", ".flv", ".webm"], "audio": [".mp3", ".wav", ".flac", ".aac", ".ogg", ".m4a"], "archives": [".zip", ".rar", ".tar", ".gz", ".bz2", ".7z", ".iso"], "code": [".py", ".java", ".c", ".cpp", ".h", ".cs", ".html", ".css", ".js", ".ts", ".jsx", ".tsx", ".xml", ".json", ".yaml", ".yml", ".sh", ".bat", ".ps1", ".rb", ".php", ".go", ".rs", ".swift", ".kt", ".ipynb", ".sql", ".toml"], "apps": [".exe", ".msi", ".apk", ".dmg", ".deb", ".rpm", ".app"], "fonts": [".ttf", ".otf", ".woff", ".woff2"], "shortcuts": [".lnk", ".url"], "other": [] } # --- Utility Functions --- def setup_logging(log_file_path=None): """Configures logging to console and optionally to a file.""" log_formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(message)s') logger = logging.getLogger('file_organizer') logger.setLevel(logging.INFO) # Set base level # Console Handler (prints INFO and above) console_handler = logging.StreamHandler() console_handler.setFormatter(log_formatter) logger.addHandler(console_handler) # File Handler (prints INFO and above if path provided) if log_file_path: try: file_handler = logging.FileHandler(log_file_path, mode='a', encoding='utf-8') file_handler.setFormatter(log_formatter) logger.addHandler(file_handler) logger.info(f"Logging initialized. Log file: {log_file_path}") except Exception as e: logger.error(f"Failed to initialize log file handler at {log_file_path}: {e}") return logger def handle_long_path(path): """Prepends the long path prefix for Windows if necessary.""" path = os.path.abspath(path) if platform.system() == "Windows" and len(path) > 259 and not path.startswith("\\\\?\\"): path = "\\\\?\\" + path return path def load_config_file(config_path, logger): """Loads category configuration from a JSON file.""" if config_path and os.path.isfile(config_path): try: with open(config_path, "r", encoding="utf-8") as f: logger.info(f"Loading category configuration from: {config_path}") return json.load(f) except (json.JSONDecodeError, OSError) as e: logger.warning(f"Could not load or parse config file '{config_path}': {e}. 
Using default categories.") else: logger.info("Using default category configuration.") return DEFAULT_CATEGORIES_CONFIG def categorize_file(filename, categories_config): """Determines the category of a file based on its extension.""" _, ext = os.path.splitext(filename) ext = ext.lower() if not ext: return "no_extension" for category, extensions in categories_config.items(): if ext in extensions: return category return "other" def is_hidden_windows(filepath): """Checks if a file or directory is hidden on Windows.""" if platform.system() != "Windows": return False try: attrs = os.stat(filepath).st_file_attributes return attrs & 2 # FILE_ATTRIBUTE_HIDDEN = 2 except OSError: return False # Assume not hidden if stat fails def create_target_folders(base_dir, organize_by, categories_config, all_extensions, logger): """Creates necessary target folders before processing files.""" base_dir = handle_long_path(base_dir) logger.info(f"Ensuring target base directory exists: {base_dir}") os.makedirs(base_dir, exist_ok=True) # Ensure base exists first folders_to_create = set() if organize_by == "category": folders_to_create = set(categories_config.keys()) | {"other", "no_extension"} elif organize_by == "extension": folders_to_create = all_extensions | {"no_extension"} else: logger.error(f"Invalid organize_by option: {organize_by}") raise ValueError("Invalid organize_by option.") logger.info(f"Creating target subfolders ({organize_by})...") created_count = 0 for folder_name in folders_to_create: folder_path = os.path.join(base_dir, folder_name) if not os.path.exists(folder_path): try: os.makedirs(folder_path, exist_ok=True) created_count += 1 except OSError as e: logger.error(f"Failed to create target folder '{folder_path}': {e}") logger.info(f"Created {created_count} new target subfolders.") def get_all_extensions(source_directory, include_hidden, follow_links, logger): """Scans the source directory to find all unique file extensions.""" extensions = set() source_directory = handle_long_path(source_directory) logger.info("Scanning for all unique file extensions...") count = 0 for root, dirs, files in os.walk(source_directory, followlinks=follow_links): root_path = handle_long_path(root) if not include_hidden: dirs[:] = [d for d in dirs if not d.startswith('.') and not is_hidden_windows(os.path.join(root_path, d))] files = [f for f in files if not f.startswith('.') and not is_hidden_windows(os.path.join(root_path, f))] for file in files: count +=1 _, ext = os.path.splitext(file) if ext: extensions.add(ext[1:].lower()) if count % 1000 == 0: # Log progress for large scans logger.info(f"Scanned {count} files for extensions...") logger.info(f"Found {len(extensions)} unique extensions.") return extensions # --- Core Logic --- def sort_files( source_directory, target_directory, organize_by, timestamp_duplicates, move_files, categories_config, include_hidden, follow_links, skip_existing, logger ): """Sorts files from source to target directory based on specified options.""" source_directory = handle_long_path(source_directory) target_directory = handle_long_path(target_directory) if not os.path.isdir(source_directory): logger.error(f"Source directory '{source_directory}' is invalid or not found.") return 0, 0 if not os.path.exists(target_directory): try: os.makedirs(target_directory) logger.info(f"Created target directory: {target_directory}") except OSError as e: logger.error(f"Could not create target directory '{target_directory}': {e}") return 0, 0 elif not os.path.isdir(target_directory): 
logger.error(f"Target path '{target_directory}' exists but is not a directory.") return 0, 0 # --- Pre-scan and Folder Creation --- total_files = 0 files_to_process = [] logger.info("Scanning source directory to count files...") for root, dirs, files in os.walk(source_directory, topdown=True, followlinks=follow_links): root_path = handle_long_path(root) original_dirs = list(dirs) # Keep original list for iteration if needed if not include_hidden: dirs[:] = [d for d in dirs if not d.startswith('.') and not is_hidden_windows(os.path.join(root_path, d))] files = [f for f in files if not f.startswith('.') and not is_hidden_windows(os.path.join(root_path, f))] for file in files: filepath = os.path.join(root_path, file) # Basic check if it's actually a file before adding try: if os.path.isfile(filepath): files_to_process.append(filepath) total_files += 1 else: logger.warning(f"Item listed as file is not a file (skipping count): {filepath}") except OSError as e: logger.warning(f"Could not access item during scan (skipping count): {filepath} - Error: {e}") logger.info(f"Found {total_files} files to process.") if total_files == 0: logger.info("No files found to process.") return 0, 0 if organize_by == "category": create_target_folders(target_directory, "category", categories_config, None, logger) elif organize_by == "extension": all_exts = get_all_extensions(source_directory, include_hidden, follow_links, logger) create_target_folders(target_directory, "extension", None, all_exts, logger) # --- Process Files --- processed_files = 0 skipped_files = 0 error_files = 0 logger.info("Starting file processing...") for i, filepath in enumerate(files_to_process): filepath = handle_long_path(filepath) file = os.path.basename(filepath) progress_prefix = f"[{i+1}/{total_files}]" try: if not os.path.exists(filepath): # Re-check existence before processing logger.warning(f"{progress_prefix} Skipping non-existent source file: {filepath}") skipped_files += 1 continue if not os.path.isfile(filepath): # Ensure it's still a file logger.warning(f"{progress_prefix} Skipping item that is not a file: {filepath}") skipped_files += 1 continue # Determine target folder if organize_by == "category": category = categorize_file(file, categories_config) target_folder = os.path.join(target_directory, category) elif organize_by == "extension": _, ext = os.path.splitext(file) ext_folder = ext[1:].lower() if ext else "no_extension" target_folder = os.path.join(target_directory, ext_folder) else: # Should not happen logger.error(f"{progress_prefix} Invalid organization option for file {file}. Skipping.") error_files += 1 continue target_fullpath = os.path.join(target_folder, file) target_fullpath = handle_long_path(target_fullpath) # Handle existing target files if os.path.exists(target_fullpath): if skip_existing: logger.info(f"{progress_prefix} Skipping existing target: {target_fullpath}") skipped_files += 1 continue elif timestamp_duplicates: counter = 1 base, ext = os.path.splitext(file) original_target_fullpath = target_fullpath # Store for logging while os.path.exists(target_fullpath): stamp = datetime.now().strftime("%Y%m%d_%H%M%S") new_name = f"{base}_{stamp}_{counter}{ext}" target_fullpath = os.path.join(target_folder, new_name) target_fullpath = handle_long_path(target_fullpath) counter += 1 logger.info(f"{progress_prefix} Target exists '{original_target_fullpath}'. 
Renaming duplicate to: {target_fullpath}") else: logger.warning(f"{progress_prefix} Overwriting existing target file: {target_fullpath}") # Overwrite happens implicitly # Perform file operation try: if move_files: shutil.move(filepath, target_fullpath) # logger.info(f"{progress_prefix} Moved: {file} -> {target_folder}") # Verbose else: shutil.copy2(filepath, target_fullpath) # copy2 preserves metadata # logger.info(f"{progress_prefix} Copied: {file} -> {target_folder}") # Verbose processed_files += 1 except (OSError, shutil.Error) as e: # Catch specific shutil errors too logger.error(f"{progress_prefix} Failed to {'move' if move_files else 'copy'} '{filepath}' to '{target_fullpath}': {e}") error_files += 1 except Exception as e: # Catch any other unexpected error during file op logger.error(f"{progress_prefix} Unexpected error processing '{filepath}' -> '{target_fullpath}': {e}", exc_info=True) # Log traceback error_files += 1 except Exception as e: # Catch errors during path manipulation, categorization etc. logger.error(f"{progress_prefix} Unexpected error processing path '{filepath}': {e}", exc_info=True) error_files += 1 # Optional: Print live progress to console (can be noisy) # print(f"Progress: {i+1}/{total_files} (P: {processed_files}, S: {skipped_files}, E: {error_files})", end="\r") # Final Summary summary = f"File organization completed. Processed: {processed_files}, Skipped: {skipped_files}, Errors: {error_files}" logger.info(summary) print(f"\n{summary}") # Also print final summary to console return processed_files, skipped_files + error_files def remove_empty_folders(directory, logger): """Recursively removes empty folders starting from the bottom up.""" directory = handle_long_path(directory) removed_count = 0 logger.info(f"Attempting to remove empty directories from: {directory}") # Walk from bottom up for root, dirs, files in os.walk(directory, topdown=False): root_path = handle_long_path(root) # Consider hidden status if needed, but generally just check emptiness if not files and not dirs: # Directory is empty try: os.rmdir(root_path) logger.info(f"Removed empty directory: {root_path}") removed_count += 1 except OSError as e: # Common errors: permission denied, directory not empty (race condition?) logger.warning(f"Could not remove directory '{root_path}': {e}") except Exception as e: logger.error(f"Unexpected error removing directory '{root_path}': {e}", exc_info=True) logger.info(f"Finished removing empty directories. Removed: {removed_count}") # --- Main Execution --- def main(): parser = argparse.ArgumentParser( description="Organize files by category or extension.", formatter_class=argparse.RawTextHelpFormatter ) parser.add_argument("--source", "-s", required=True, help="Source directory containing files to organize.") parser.add_argument("--target", "-t", required=True, help="Target directory where organized files will be placed.") parser.add_argument("--config", "-c", help="Path to JSON config file for custom file categories and extensions.") parser.add_argument( "--organize-by", choices=["category", "extension"], default="category", help="Method for organizing files:\n" " category: Group into folders based on categories (default).\n" " extension: Group into folders named after file extensions." 
) parser.add_argument("--move", action="store_true", help="Move files instead of copying them.") parser.add_argument("--timestamp-duplicates", action="store_true", help="Append timestamp+counter to duplicate filenames instead of overwriting/skipping.") parser.add_argument("--skip-existing", action="store_true", help="Skip processing if a file with the same name exists in the target.") parser.add_argument("--include-hidden", action="store_true", help="Include hidden files/folders (e.g., starting with '.') in processing.") parser.add_argument("--follow-links", action="store_true", help="Follow symbolic links (process target, not link). Use with caution (potential loops).") parser.add_argument("--remove-empty-source-dirs", action="store_true", help="After moving (--move must be enabled), attempt to remove empty source directories.") parser.add_argument("--log-file", help="Optional path to a file for logging progress and errors.") args = parser.parse_args() # --- Argument Validation --- if args.timestamp_duplicates and args.skip_existing: parser.error("--timestamp-duplicates and --skip-existing cannot be used together.") if args.remove_empty_source_dirs and not args.move: parser.error("--remove-empty-source-dirs requires --move to be enabled.") # --- Setup --- logger = setup_logging(args.log_file) logger.info("Script starting...") logger.info(f"Arguments: {vars(args)}") # Log arguments used categories = load_config_file(args.config, logger) # --- Execute Sorting --- processed, failed_or_skipped = sort_files( source_directory=args.source, target_directory=args.target, organize_by=args.organize_by, timestamp_duplicates=args.timestamp_duplicates, move_files=args.move, categories_config=categories, include_hidden=args.include_hidden, follow_links=args.follow_links, skip_existing=args.skip_existing, logger=logger ) # --- Optional Cleanup --- if args.move and args.remove_empty_source_dirs and processed > 0: remove_empty_folders(args.source, logger) logger.info("Script finished.") print("\nScript finished. Check console and log file (if specified) for details.") if __name__ == "__main__": main()