Quickstart Guide ================ This quickstart guide will help you get up and running with PySIPS in under 5 minutes. Installation ------------ Install PySIPS using pip: .. code-block:: bash pip install pysips Basic Usage ----------- Here's a minimal example to get you started with symbolic regression: .. code-block:: python import numpy as np from pysips import PysipsRegressor from sklearn.model_selection import train_test_split from sklearn.metrics import r2_score # Generate synthetic data: y = x^2 + noise np.random.seed(42) X = np.linspace(-3, 3, 100).reshape(-1, 1) y = X[:, 0]**2 + np.random.normal(0, 0.1, size=X.shape[0]) # Split into training and testing sets X_train, X_test, y_train, y_test = train_test_split( X, y, test_size=0.2, random_state=42 ) # Create the regressor regressor = PysipsRegressor( operators=['+', '-', '*'], # Available operators max_complexity=12, # Maximum expression size num_particles=100, # Population size num_mcmc_samples=10, # MCMC steps per iteration max_time=60, # Maximum runtime in seconds random_state=42 ) # Fit the regressor regressor.fit(X_train, y_train) # Make predictions y_pred = regressor.predict(X_test) # Get the discovered expression expression = regressor.get_expression() print(f"Discovered expression: {expression}") print(f"R² score: {r2_score(y_test, y_pred):.4f}") Expected Output --------------- .. code-block:: text Discovered expression: x_0^2 R² score: 0.9987 Understanding the Parameters ----------------------------- **Essential Parameters:** - ``operators``: List of mathematical operators to use (e.g., ``['+', '-', '*', '/', 'pow']``) - ``max_complexity``: Maximum size of the expression graph (controls model complexity) - ``num_particles``: Number of particles in the SMC population (higher = better exploration) - ``random_state``: Random seed for reproducibility **Time Control:** - ``max_time``: Maximum runtime in seconds (default: no limit) - ``show_progress_bar``: Display progress during fitting (default: True) **Model Selection:** - ``model_selection``: Choose ``'mode'`` (most frequent) or ``'max_likelihood'`` (best scoring) Accessing Results ----------------- After fitting, you can access various results: .. code-block:: python # Get the best expression as a string expression = regressor.get_expression() # Get all unique models and their likelihoods models, likelihoods = regressor.get_models() print(f"Number of unique models: {len(models)}") # Make predictions on new data y_pred = regressor.predict(X_new) Next Steps ---------- - Continue to the :doc:`tutorial` for more advanced usage and examples - Explore the :doc:`api/modules` for detailed API documentation - Check out the examples in the ``demos/`` directory of the repository Common Issues ------------- **Long Runtime:** If fitting takes too long, try: - Reducing ``num_particles`` (e.g., 50-100 for quick experiments) - Reducing ``max_complexity`` (e.g., 10-15 for simpler expressions) - Setting ``max_time`` to limit the runtime **Poor Results:** If the discovered expression is not accurate, try: - Increasing ``num_particles`` (e.g., 200-500 for better exploration) - Adjusting the available ``operators`` - Increasing ``max_complexity`` if you expect more complex relationships - Running for longer (increase or remove ``max_time``)