NAM
- class stream_topic.NAM.DownstreamModel(trained_topic_model, target_column, dataset=None, task='regression', batch_size=128, lr=0.0005, hidden_units=None, feature_dropout=0.0, hidden_dropout=0.3, activation='relu', out_activation=None)
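PyTorch Lightning module that fits a Neural Additive Model (NAM) to a downstream target, using topic probabilities from a trained topic model combined with structured data as input features.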
- Parameters:
trained_topic_model (AbstractModel) – Trained topic model.
target_column (str) – Name of the target column.
dataset (AbstractDataset, optional) – Dataset object (default is None).
structured_data (pd.DataFrame, optional) – Structured data (default is None).
task (str, optional) – Type of task, either ‘regression’ or ‘classification’ (default is ‘regression’).
batch_size (int, optional) – Batch size for training (default is 128).
lr (float, optional) – Learning rate for optimization (default is 0.0005).
hidden_units (List[int], optional) – List of hidden layer sizes for the Neural Additive Model (default is None).
feature_dropout (float, optional) – Dropout probability for input features (default is 0.0).
hidden_dropout (float, optional) – Dropout probability for hidden layers (default is 0.3).
activation (str, optional) – Activation function for hidden layers (default is ‘relu’).
out_activation (nn.Module, optional) – Activation function for output layer (default is None).
- trained_topic_model
Trained topic model.
- Type:
AbstractModel
- task
Type of task, either ‘regression’ or ‘classification’.
- Type:
str
- batch_size
Batch size for training.
- Type:
int
- lr
Learning rate for optimization.
- Type:
float
- loss_fn
Loss function for the task.
- Type:
nn.Module
- structured_data
Structured data used for downstream modeling.
- Type:
pd.DataFrame
- target_column
Name of the target column.
- Type:
str
- combined_data
Combined DataFrame containing structured data and topic probabilities.
- Type:
pd.DataFrame
- model
Neural Additive Model for downstream modeling.
- Type:
NeuralAdditiveModel
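The `task` and `loss_fn` attributes are related: the loss depends on whether the task is 'regression' or 'classification'. The documentation does not say which loss is used for each task, so the sketch below assumes the common pairing of mean squared error for regression and cross-entropy for classification (in PyTorch, `nn.MSELoss` and `nn.CrossEntropyLoss`). The helper names `mse_loss`, `cross_entropy_loss`, and `select_loss` are hypothetical and not part of stream_topic.

```python
import math


def mse_loss(predictions, targets):
    """Mean squared error over paired predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def cross_entropy_loss(logits, labels):
    """Mean cross-entropy over rows of class logits and integer class labels."""
    total = 0.0
    for row, label in zip(logits, labels):
        peak = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_norm = peak + math.log(sum(math.exp(v - peak) for v in row))
        total += log_norm - row[label]
    return total / len(labels)


def select_loss(task):
    """Assumed mapping from task name to loss; not taken from the library."""
    losses = {"regression": mse_loss, "classification": cross_entropy_loss}
    if task not in losses:
        raise ValueError(f"unsupported task: {task!r}")
    return losses[task]


regression_loss = select_loss("regression")([1.0, 2.0], [1.0, 4.0])
classification_loss = select_loss("classification")([[0.0, 0.0]], [0])
```

Rejecting unknown task names early, as `select_loss` does here, is a design choice of this sketch; the actual class may handle invalid values differently.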
- configure_optimizers()
Configure optimizer for training.
- Returns:
Optimizer.
- Return type:
torch.optim.Optimizer
- define_nam_model(hidden_units, feature_dropout, hidden_dropout, activation, out_activation)
Define the Neural Additive Model architecture.
- Parameters:
hidden_units (List[int]) – List of hidden layer sizes for the Neural Additive Model.
feature_dropout (float) – Dropout probability for input features.
hidden_dropout (float) – Dropout probability for hidden layers.
activation (str) – Activation function for hidden layers.
out_activation (nn.Module) – Activation function for output layer.
- Returns:
Initialized Neural Additive Model.
- Return type:
NeuralAdditiveModel
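For readers unfamiliar with Neural Additive Models, the defining idea is that each input feature gets its own small network, and the prediction is a bias plus the sum of the per-feature outputs. The sketch below illustrates only that additive structure in plain Python. It is not the library's `NeuralAdditiveModel`: the names `make_feature_net` and `TinyAdditiveModel` are hypothetical, the ReLU hidden layers and random initialization are assumptions, and dropout and training are omitted.

```python
import math
import random


def make_feature_net(hidden_units, rng):
    """Build a tiny one-input MLP with ReLU hidden layers; returns a callable."""
    sizes = [1] + list(hidden_units) + [1]
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.gauss(0.0, 1.0 / math.sqrt(n_in)) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))

    def net(value):
        activations = [value]
        for index, (weights, biases) in enumerate(layers):
            activations = [
                sum(w * a for w, a in zip(row, activations)) + b
                for row, b in zip(weights, biases)
            ]
            if index < len(layers) - 1:  # ReLU on hidden layers only
                activations = [max(0.0, a) for a in activations]
        return activations[0]

    return net


class TinyAdditiveModel:
    """Hypothetical stand-in: prediction = bias + sum of f_i(x_i) over features."""

    def __init__(self, feature_names, hidden_units=(8, 4), seed=0):
        rng = random.Random(seed)
        self.feature_names = list(feature_names)
        self.feature_nets = {name: make_feature_net(hidden_units, rng)
                             for name in self.feature_names}
        self.bias = 0.0

    def contributions(self, row):
        """Per-feature contributions; each depends on that feature alone."""
        return {name: self.feature_nets[name](row[name])
                for name in self.feature_names}

    def predict(self, row):
        return self.bias + sum(self.contributions(row).values())


model = TinyAdditiveModel(["age", "topic_0", "topic_1"])
row = {"age": 0.5, "topic_0": 0.7, "topic_1": 0.3}
parts = model.contributions(row)
prediction = model.predict(row)
```

Because each contribution depends on a single feature, changing `age` leaves the `topic_0` and `topic_1` contributions unchanged, which is what makes the per-feature effects easy to inspect.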
- forward(x)
Forward pass of the model.
- Parameters:
x (torch.Tensor) – Input tensor.
- Returns:
Output tensor.
- Return type:
torch.Tensor
- get_feature_names()
Get names of input features.
- Returns:
List of feature names.
- Return type:
List[str]
- prepare_combined_data()
Prepare combined DataFrame containing structured data and topic probabilities.
- Returns:
Combined DataFrame.
- Return type:
pd.DataFrame
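As a rough illustration of what "combined" means here, the sketch below appends one column per topic to each structured row. It uses plain Python dictionaries instead of a `pd.DataFrame` so that it runs without dependencies. The helper `combine_rows`, the `topic_` column prefix, and the example columns `price` and `rating` are all hypothetical; the library's actual column naming and alignment logic may differ.

```python
def combine_rows(structured_rows, topic_probabilities, prefix="topic_"):
    """Append one column per topic to each structured row (row order must match)."""
    if len(structured_rows) != len(topic_probabilities):
        raise ValueError("structured data and topic probabilities differ in length")
    combined = []
    for row, probs in zip(structured_rows, topic_probabilities):
        merged = dict(row)  # copy so the input rows are left untouched
        for index, prob in enumerate(probs):
            merged[f"{prefix}{index}"] = prob
        combined.append(merged)
    return combined


structured_rows = [
    {"price": 12.5, "rating": 4.0},
    {"price": 7.0, "rating": 3.5},
]
# Hypothetical document-topic probabilities: two documents, three topics.
topic_probabilities = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
]

combined = combine_rows(structured_rows, topic_probabilities)
# Treating "rating" as the target column, the rest are input features.
feature_names = [name for name in combined[0] if name != "rating"]
```

In this sketch the feature names are simply every combined column except the target, which is one plausible reading of what `get_feature_names()` returns.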
- preprocess_structured_data(data)
Preprocess structured data.
- Parameters:
data (pd.DataFrame) – Structured data.
- Returns:
Preprocessed structured data.
- Return type:
pd.DataFrame