11 2 learning rate schedulers
Detailed Learning Rate Schedulers Examples for xtorch
This document expands the "Data Handling and Optimization and Training Techniques -> Learning Rate Schedulers" subcategory of examples for the xtorch library, a C++ deep learning framework that extends PyTorch’s LibTorch API with user-friendly abstractions. The goal is to provide a comprehensive set of beginner-to-intermediate examples that introduce users to learning rate scheduler-based training tasks, with a focus on time series and graph models to align with the broader "Time Series and Graph" context. These examples showcase xtorch’s capabilities in dynamic learning rate adjustment, training efficiency, and C++ ecosystem integration, and are designed to be included in the xtorch-examples repository, helping users learn learning rate schedulers in C++.
Background and Context
xtorch simplifies deep learning for C++ developers by offering high-level model classes (e.g., XTModule, ResNetExtended, XTCNN), a streamlined training loop via the Trainer module, enhanced data utilities (e.g., ImageFolderDataset, CSVDataset), extended optimizers (e.g., xtorch::optim), and model serialization tools. The original two learning rate scheduler examples—using a step decay scheduler and cosine annealing with warm restarts—provide a solid foundation. This expansion adds six more examples to cover additional schedulers (e.g., exponential decay, ReduceLROnPlateau, cyclical, linear decay), model types (e.g., LSTM, GCN, GraphSAGE), and training scenarios (e.g., anomaly detection, graph embedding, real-time training), ensuring a broad introduction to learning rate schedulers with a focus on time series and graph applications.
The current time is 1:30 PM PDT on Monday, April 21, 2025, and all considerations are based on available information from the xtorch GitHub repository (xtorch GitHub Repository) without contradicting this timeframe.
Expanded Examples
The following table provides a detailed list of eight "Data Handling and Optimization and Training Techniques -> Learning Rate Schedulers" examples, including the original two and six new ones. Each example is designed to be standalone, with a clear focus on a specific learning rate scheduler concept or xtorch feature, making it accessible for users.
| Category | Subcategory | Example Title | Description |
|---|---|---|---|
| Data Handling and Optimization and Training Techniques | Learning Rate Schedulers | Step Decay Learning Rate Scheduler | Uses a step decay scheduler (xtorch::optim::StepLR) to adjust learning rates during training of a convolutional neural network (CNN) on the MNIST dataset (handwritten digits). Optimizes with SGD and cross-entropy loss, and evaluates with training convergence (loss curves) and test accuracy. |
| Cosine Annealing with Warm Restarts | Implements cosine annealing with warm restarts (xtorch::optim::CosineAnnealingWarmRestarts) to train an LSTM for time series forecasting on the UCI Appliances Energy Prediction dataset. Optimizes with Adam and Mean Squared Error (MSE) loss, and evaluates with generalization performance (Root Mean Squared Error, RMSE) and training stability (loss variance). |
||
| Exponential Decay Scheduler for Graph Node Classification | Applies an exponential decay scheduler (xtorch::optim::ExponentialLR) to train a Graph Convolutional Network (GCN) for node classification on the Cora dataset (citation network). Optimizes with RMSprop and cross-entropy loss, and evaluates with classification accuracy and convergence speed (epochs to converge). |
||
| ReduceLROnPlateau for Time Series Anomaly Detection | Uses the ReduceLROnPlateau scheduler (xtorch::optim::ReduceLROnPlateau) to train an autoencoder for anomaly detection on the PhysioNet ECG dataset (heart signals). Optimizes with Adagrad and MSE loss, adjusts learning rate based on validation loss, and evaluates with Area Under the ROC Curve (AUC-ROC) and training adaptability. |
||
| Cyclical Learning Rate for Molecular Graph Property Prediction | Implements a cyclical learning rate scheduler (xtorch::optim::CyclicLR) to train a graph neural network for molecular property prediction (e.g., dipole moment) on the QM9 dataset (small molecules). Optimizes with Adam and Mean Absolute Error (MAE) loss, and evaluates with prediction accuracy (MAE) and training efficiency (time per epoch). |
||
| Linear Decay Scheduler for Time Series Classification | Applies a linear decay scheduler (xtorch::optim::LambdaLR) to train a CNN for time series classification on a custom IoT sensor dataset (e.g., accelerometer data). Optimizes with Adam and cross-entropy loss, and evaluates with classification accuracy and training stability (loss curves). |
||
| Cosine Annealing for Large-Scale Graph Node Embedding | Uses a cosine annealing scheduler (xtorch::optim::CosineAnnealingLR) to train a GraphSAGE model for node embedding on the PPI dataset (protein interactions). Optimizes with Sparse Adam and unsupervised loss (e.g., graph reconstruction), and evaluates with embedding quality (downstream classification accuracy) and scalability (time per epoch). |
||
| Scheduler with Visualization for Real-Time Time Series Training | Combines a step decay scheduler (xtorch::optim::StepLR) with OpenCV to train an LSTM for real-time time series forecasting on streaming IoT sensor data (e.g., temperature readings). Visualizes learning rate and loss curves during training, optimizes with Adam and MSE loss, and evaluates with RMSE and visualization quality (clear plots). |
Rationale for Each Example
- Step Decay Learning Rate Scheduler: Introduces basic learning rate scheduling, using step decay on MNIST to teach foundational scheduler concepts, ideal for beginners.
- Cosine Annealing with Warm Restarts: Demonstrates advanced scheduling with cosine annealing, using an LSTM for time series forecasting to teach dynamic adjustment, aligning with the time series focus.
- Exponential Decay Scheduler for Graph Node Classification: Introduces exponential decay for graph models, using a GCN on Cora to teach smooth learning rate reduction, aligning with the graph focus.
- ReduceLROnPlateau for Time Series Anomaly Detection: Focuses on adaptive scheduling, using ReduceLROnPlateau on ECG data to teach loss-driven adjustment, relevant for healthcare applications.
- Cyclical Learning Rate for Molecular Graph Property Prediction: Demonstrates cyclical scheduling, using a graph neural network on QM9 to teach efficient training for graph tasks, relevant for cheminformatics.
- Linear Decay Scheduler for Time Series Classification: Introduces linear decay for stable training, using a CNN on IoT data to teach gradual learning rate reduction for time series classification.
- Cosine Annealing for Large-Scale Graph Node Embedding: Shows scalable scheduling, using cosine annealing on PPI to teach optimization for large graphs, relevant for big data applications.
- Scheduler with Visualization for Real-Time Time Series Training: Demonstrates visualization-integrated scheduling, using step decay on streaming IoT data to teach real-time training, relevant for IoT applications.
Implementation Details
Each example should be implemented as a standalone C++ program in the xtorch-examples repository, with the following structure:
- Source Code: A main.cpp file containing the example code, using xtorch’s scheduler utilities (e.g., xtorch::optim::StepLR, CosineAnnealingWarmRestarts), optimizer utilities (e.g., xtorch::optim::SGD, Adam), model modules (e.g., xtorch::nn), data utilities (e.g., CSVDataset), and, where applicable, OpenCV for visualization.
- Build Instructions: A CMakeLists.txt file to compile the example, linking against xtorch, LibTorch, and OpenCV (if needed).
- README.md: A detailed guide explaining the example’s purpose, prerequisites (e.g., LibTorch, dataset downloads, OpenCV), steps to run, and expected outputs (e.g., accuracy, RMSE, MAE, AUC-ROC, convergence speed, or visualization quality).
- Dependencies: Ensure users have xtorch, LibTorch, datasets (e.g., MNIST, UCI Appliances, Cora, PhysioNet ECG, QM9, PPI, custom IoT), and optionally OpenCV installed, with download instructions in each README. Graph datasets may require custom utilities or integration with C++ graph libraries.
For example, the “Exponential Decay Scheduler for Graph Node Classification” might include:
- Code: Train a GCN on the Cora dataset using xtorch::optim::RMSprop with an exponential decay scheduler (xtorch::optim::ExponentialLR), optimize with cross-entropy loss, load data with a custom graph dataset class, and output training loss and test accuracy, using xtorch’s modules and utilities.
- Build: Use CMake to link against xtorch and LibTorch, specifying paths to Cora dataset.
- README: Explain exponential decay scheduling for graph optimization, provide compilation and training commands, and show sample output (e.g., test accuracy of ~0.85, convergence in 40 epochs).
Why These Examples?
These examples are designed to:
- Cover Core Concepts: From basic step decay and cosine annealing to adaptive ReduceLROnPlateau, cyclical, and exponential decay schedulers, they introduce key learning rate scheduler paradigms for time series and graph applications.
- Leverage xtorch’s Strengths: They highlight xtorch’s xtorch::optim scheduler utilities, model flexibility, and C++ performance, particularly for dynamic and efficient training.
- Be Progressive: Examples start with simpler schedulers (step decay) and progress to complex ones (cyclical, ReduceLROnPlateau), supporting a learning path.
- Address Practical Needs: Techniques like adaptive scheduling, cyclical learning rates, and real-time training are widely used in real-world applications, from IoT to bioinformatics.
- Encourage Exploration: Examples like visualization-integrated scheduling and large-scale graph scheduling expose users to cutting-edge training scenarios, fostering innovation.
Feasibility and Alignment with xtorch
The examples are feasible given xtorch’s features, as outlined in its GitHub repository:
- Scheduler Utilities: xtorch::optim supports StepLR, CosineAnnealingWarmRestarts, ExponentialLR, ReduceLROnPlateau, CyclicLR, LambdaLR, and CosineAnnealingLR, enabling diverse scheduling strategies.
- Optimizer Compatibility: Schedulers integrate with xtorch::optim optimizers (e.g., SGD, Adam, RMSprop, Adagrad, Sparse Adam), supporting all examples.
- Model Compatibility: xtorch::nn modules (e.g., Conv2d, LSTM, custom graph layers) support CNNs, LSTMs, GCNs, and GraphSAGE for time series and graph tasks.
- Data Handling: xtorch::data::CSVDataset and custom dataset classes handle image, time series, and graph datasets, with support for preprocessing (e.g., normalization, feature extraction).
- Training Pipeline: The Trainer API simplifies training loops, loss computation, and scheduler updates, compatible with all examples.
- Evaluation: xtorch’s utilities support metrics like accuracy, RMSE, MAE, AUC-ROC, convergence speed, and downstream task performance.
- C++ Integration: xtorch’s compatibility with OpenCV enables real-time visualization of training progress, enhancing user interaction.
The examples align with xtorch’s goal of simplifying deep learning in C++ and fit the "Time Series and Graph" context by emphasizing time series and graph scheduling, making them ideal for the xtorch-examples repository’s learning rate schedulers section.
Comparison with Existing Practices
Popular deep learning libraries like PyTorch provide scheduler tutorials, such as “Learning Rate Scheduling in PyTorch” (PyTorch Tutorials), which cover Python-based scheduling. The proposed xtorch examples adapt this approach to C++, leveraging xtorch’s xtorch::optim scheduler utilities and C++ performance. They also include time series and graph-specific scheduling (e.g., UCI, Cora, QM9) and advanced scenarios (e.g., cyclical learning rates, real-time training) to align with the category and modern training trends, as seen in repositories like “pyg-team/pytorch_geometric” for graph model optimization (GitHub - pyg-team/pytorch_geometric).
Implementation Notes
- Directory Structure: Organize
xtorch-exampleswith adata_handling_and_optimization_and_training_techniques/learning_rate_schedulers/directory, containing subdirectories for each example (e.g.,step_decay_mnist/,cosine_annealing_timeseries_uci/). - User Guidance: The main
README.mdshould list all examples, suggest a learning path (e.g., start with step decay, then cosine annealing, then ReduceLROnPlateau), and link to xtorch’s documentation. - C++ Focus: Ensure code uses modern C++ practices (e.g., smart pointers, exception handling) and includes detailed comments for clarity.
- Dependencies: Note that users need LibTorch, xtorch, datasets (e.g., MNIST, UCI Appliances, Cora, PhysioNet ECG, QM9, PPI, custom IoT), and optionally OpenCV installed, with download and setup instructions in each README. Graph datasets may require custom utilities or integration with C++ graph libraries.
Conclusion
The expanded list of eight "Data Handling and Optimization and Training Techniques -> Learning Rate Schedulers" examples provides a comprehensive introduction to learning rate scheduler-based training with xtorch, covering step decay, cosine annealing with warm restarts, exponential decay, ReduceLROnPlateau, cyclical learning rates, linear decay, cosine annealing for large-scale graphs, and visualization-integrated scheduling. These examples are beginner-to-intermediate friendly, leverage xtorch’s strengths, and align with its goal of making deep learning accessible in C++ while addressing time series and graph applications. By including them in xtorch-examples, you can help users build a solid foundation in dynamic learning rate adjustment, fostering adoption and engagement with the xtorch community.