Source code for pysips.mutation_proposal

"""
Mutation-Based Proposal Generator for Symbolic Regression Models.

This module provides a proposal mechanism for symbolic regression that uses
bingo's AGraph mutation operations to generate new candidate models from existing
ones. It is designed to work within Markov Chain Monte Carlo (MCMC) sampling
frameworks where new model proposals are needed at each step.

The module implements a configurable mutation strategy that can perform various
types of structural changes to symbolic mathematical expressions, including
adding/removing nodes, changing operations, modifying parameters, and pruning
or expanding expression trees.

Key Features
------------
- Multiple mutation types: command, node, parameter, prune, and fork mutations
- Configurable probabilities for each mutation type
- Repeat mutation capability for more dramatic changes
- Ensures non-identical proposals (prevents proposing the same model)
- Seeded random number generation for reproducible results
- Integration with bingo's ComponentGenerator for operator management

Mutation Types
--------------
Command Mutation
    Changes the operation at a node (e.g., '+' to '*')
Node Mutation
    Replaces a node with a new randomly generated subtree
Parameter Mutation
    Modifies the numeric constants in the expression
Prune Mutation
    Removes a portion of the expression tree
Fork Mutation
    Adds a new branch to the expression tree
Repeat Mutation
    Recursively applies additional mutations with specified probability

Usage Example
-------------
>>> # Create a mutation proposal generator
>>> proposal = MutationProposal(
...     X_dim=3,  # 3 input features
...     operators=["+", "subtract", "multiply", "divide"],
...     terminal_probability=0.2,
...     command_probability=0.3,
...     node_probability=0.2,
...     seed=42
... )
>>>
>>> # Use in MCMC sampling (assuming you have a model)
>>> # new_model = proposal(current_model)

Notes
-----
The proposal generator ensures that new proposals are always different from
the input model by repeatedly applying mutations until a change occurs. This
prevents MCMC chains from getting stuck with identical consecutive states.

The update() method is provided for compatibility with adaptive MCMC frameworks
but currently performs no operations, as the mutation probabilities are fixed
at initialization.
"""

import numpy as np
from bingo.symbolic_regression import (
    ComponentGenerator,
    AGraphMutation,
)


[docs] class MutationProposal: """Proposal functor that performs bingo's Agraph mutation Parameters ---------- x_dim : int dimension of input data (number of features in dataset) operators : list of str list of equation primatives to allow, e.g. ["+", "subtraction", "pow"] terminal_probability : float, optional [0.0-1.0] probability that a new node will be a terminal, by default 0.1 constant_probability : float, optional [0.0-1.0] probability that a new terminal will be a constant, by default weighted the same as a single feature of the input data command_probability : float, optional probability of command mutation, by default 0.2 node_probability : float, optional probability of node mutation, by default 0.2 parameter_probability : float, optional probability of parameter mutation, by default 0.2 prune_probability : float, optional probability of pruning (removing a portion of the equation), by default 0.2 fork_probability : float, optional probability of forking (adding an additional branch to the equation), by default 0.2 repeat_mutation_probability : float, optional probability of a repeated mutation (applied recursively). default 0.0 seed : int, optional random seed used to control repeatability """ # pylint: disable=R0913,R0917
[docs] def __init__( self, x_dim, operators, terminal_probability=0.1, constant_probability=None, command_probability=0.2, node_probability=0.2, parameter_probability=0.2, prune_probability=0.2, fork_probability=0.2, repeat_mutation_probability=0.0, seed=None, ): self._rng = np.random.default_rng(seed) component_generator = ComponentGenerator( input_x_dimension=x_dim, terminal_probability=terminal_probability, constant_probability=constant_probability, ) for comp in operators: component_generator.add_operator(comp) self._mutation = AGraphMutation( component_generator, command_probability, node_probability, parameter_probability, prune_probability, fork_probability, ) self._repeat_mutation_prob = repeat_mutation_probability
def _do_mutation(self, model): new_model = self._mutation(model) while self._rng.random() < self._repeat_mutation_prob: new_model = self._mutation(new_model) return new_model
[docs] def __call__(self, model): """ Apply mutation to generate a new symbolic expression model. This method takes a symbolic regression model (AGraph) as input and returns a new model created by applying one or more mutation operations. The method guarantees that the returned model is different from the input model by repeating mutations if necessary. Parameters ---------- model : AGraph The input symbolic regression model to be mutated. This should be a bingo AGraph instance representing a mathematical expression. Returns ------- AGraph A new symbolic regression model created by applying mutation(s) to the input model. Guaranteed to be different from the input model. Mutation Process --------------- 1. **Initial Mutation**: Applies the configured mutation operation to the model 2. **Repeat Mutations**: May apply additional mutations based on repeat_mutation_probability 3. **Difference Check**: Ensures the new model differs from the original one 4. **Repeated Attempts**: If the mutation produces an identical model, tries again Notes ----- - The mutation type applied is selected probabilistically based on the probabilities specified during initialization (command_probability, node_probability, etc.) - The repeat mutation feature allows for more dramatic changes by applying multiple mutations in sequence with probability repeat_mutation_probability - This method will always return a different model, never the same as the input See Also -------- AGraphMutation : Bingo's mutation implementation used internally """ new_model = self._do_mutation(model) while new_model == model: new_model = self._do_mutation(model) return new_model
[docs] def update(self, *args, **kwargs): """ Update method for compatibility with adaptive MCMC frameworks. This method is provided to maintain API compatibility with other proposal mechanisms that support adaptive behavior. In the current implementation, the method is a no-op as the mutation proposal does not adapt its behavior based on sampling history. Parameters ---------- *args : tuple Positional arguments (not used in the current implementation). **kwargs : dict Keyword arguments (not used in the current implementation). Returns ------- None This method does not return any value. Notes ----- Future versions might implement adaptive behavior such as: - Adjusting mutation probabilities based on acceptance rates - Learning which mutation types are more effective for a given problem In composite proposal mechanisms that combine multiple proposal types (such as RandomChoiceProposal), this method will be called as part of the update process, but currently has no effect on this proposal. """
# pass