# MCP Workflows
These workflows show how the MCP tools chain together for common growth engineering tasks. Each workflow is a conversation between you and your AI agent.
## Interactive Experiment Creation
The most common workflow: you have an idea, and you want to test it properly.
### Plan the experiment

Tell your agent what you want to improve. It calls `plan_experiment`, which returns related beliefs, past experiments, suggested metrics, and traffic allocation options.

> "I want to improve signup conversion on the pricing page"

The agent presents options as a numbered list. Pick your belief, confidence level, and traffic split.
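Under the hood this is a single MCP tool call. The field names below are illustrative assumptions (only the `plan_experiment` tool name comes from this page), but a request/response pair might look like:

```typescript
// Hypothetical shape of a plan_experiment exchange.
// Only the tool name comes from the docs; every field name here is an assumption.
const request = {
  tool: "plan_experiment",
  arguments: { goal: "improve signup conversion on the pricing page" },
};

// A plausible response: related beliefs, past experiments,
// suggested metrics, and traffic allocation options.
const response = {
  relatedBeliefs: [
    { id: "belief-12", statement: "Pricing clarity drives signups", confidence: 0.6 },
  ],
  pastExperiments: [{ name: "pricing-cta-copy", outcome: "inconclusive" }],
  suggestedMetrics: ["signup_conversion"],
  trafficOptions: [50, 20, 10], // percent of visitors sent to the variant
};
```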
### Record the reasoning

The agent calls `start_reasoning` with your chosen belief and confidence. This creates a belief and hypothesis in the Apex graph, establishing why you're running this test.
### Log a prediction

Before seeing any results, the agent calls `log_prediction` so you commit to an expected outcome. This feeds the calibration loop: over time, Apex tracks how accurately your team forecasts results.
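Together, these two steps might translate to calls like the following. The tool names come from this page; the argument shapes are assumptions for illustration:

```typescript
// Hypothetical sketch of the two calls made before any traffic is split.
// Field names are assumptions; only the tool names come from the docs.
const reasoning = {
  tool: "start_reasoning",
  arguments: {
    belief: "Pricing clarity drives signups",
    confidence: 0.6, // how strongly you hold the belief before the test
  },
};

const prediction = {
  tool: "log_prediction",
  arguments: {
    experiment: "pricing-signup-cta",
    expectedLift: 0.05, // a 5% relative lift, committed to before results exist
  },
};
```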
### Create the experiment

The agent calls `create_experiment` with `preview: true` first. You review the summary: name, control vs. variant, traffic split, linked belief. Confirm to create.
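A sketch of the preview-then-confirm pattern, assuming the only difference between the two calls is the flag (only `create_experiment` and `preview: true` appear on this page; the rest is illustrative):

```typescript
// Hypothetical two-step create: preview first, then the real call.
const preview = {
  tool: "create_experiment",
  arguments: {
    name: "pricing-signup-cta",
    control: "Sign up today",
    variant: "Start free",
    trafficSplit: 50,
    beliefId: "belief-12",
    preview: true, // returns a summary instead of creating anything
  },
};

// After you confirm, the agent repeats the call without the preview flag.
const confirmed = { ...preview, arguments: { ...preview.arguments, preview: false } };
```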
### Implement the variant

For SDK-mode experiments, the agent implements the code changes using `useApexVariant`. For snippet-mode experiments, no code changes are needed: the variant is applied at runtime via the DOM.
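A minimal sketch of the SDK-mode branching pattern. `useApexVariant` comes from this page; the experiment key, return type, and copy are assumptions, and a local stub stands in for the real hook so the sketch runs without the SDK:

```typescript
type Variant = "control" | "variant";

// Stand-in for the real SDK hook, which would return this
// visitor's assignment; fixed here purely for illustration.
function useApexVariant(experimentKey: string): Variant {
  return "variant";
}

// The variant gate: one conditional on the assignment.
function pricingHeadline(): string {
  const variant = useApexVariant("pricing-signup-cta");
  return variant === "variant"
    ? "Start free, no credit card required"
    : "Sign up today";
}
```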
### Deploy and activate

After pushing the code, the agent calls `track_deployment` with the commit SHA, then `check_deployment` to verify it's live. Once confirmed, `activate_experiment` starts splitting traffic.
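The ordering matters: track, verify, then activate. A sketch of the sequence, where the tool names come from this page but the argument shapes and the example SHA are assumptions:

```typescript
// Hypothetical deploy-and-activate sequence, recorded as an in-order call log.
const calls: Array<{ tool: string; arguments: Record<string, unknown> }> = [];
const callTool = (tool: string, args: Record<string, unknown> = {}) =>
  calls.push({ tool, arguments: args });

callTool("track_deployment", { sha: "a1b2c3d" });  // record which commit carries the variant
callTool("check_deployment", { sha: "a1b2c3d" });  // verify that commit is live
callTool("activate_experiment", { name: "pricing-signup-cta" }); // start splitting traffic
```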
## Decision Guardrails
Use this workflow before committing to any significant product change — even if you're not planning an experiment.
### Evaluate the feature

Describe what you're considering building. The agent calls `evaluate_feature`, which checks the belief graph for related assumptions, scans past experiment results, and identifies confidence gaps.

> "Should we add a chatbot to the homepage?"
### Predict the impact

If you decide to proceed, the agent calls `predict_impact` to search for historical experiments with similar changes. It returns average lift, sample sizes, and a recommendation.
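The returned categories (average lift, sample sizes, a recommendation) come from this page; the field names and numbers below are assumptions, sketching what such a response might look like:

```typescript
// Hypothetical predict_impact response, aggregated from similar past experiments.
const impact = {
  similarExperiments: 4,
  averageLift: -0.02, // e.g. similar changes averaged a 2% relative drop
  totalSampleSize: 48_000,
  recommendation: "experiment_first",
};
```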
### Decide
Based on the evidence, you can:
- Build with confidence — historical data supports the change
- Run an experiment first — signals are mixed or data is missing
- Record a belief and revisit later — you're not ready to commit engineering time
> **Tip:** Even if you skip the experiment, recording a belief with `create_belief` ensures the assumption is tracked. When you revisit it later, you'll have the context for why you deferred.
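Only the `create_belief` tool name appears on this page; the fields below are illustrative assumptions about what recording a deferred assumption might look like:

```typescript
// Hypothetical create_belief call for a feature you chose not to build yet.
const belief = {
  tool: "create_belief",
  arguments: {
    statement: "A homepage chatbot would increase signups",
    confidence: 0.3, // low: untested, which is exactly why it's worth recording
    note: "Deferred pending evidence from related homepage experiments",
  },
};
```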
## Pre-Build Evaluation
Before starting a sprint or picking up a feature card, ask the agent to evaluate it:
> "Evaluate this feature: adding social proof badges to the pricing page"

The agent calls `evaluate_feature` and returns:
- Related beliefs — what your team already assumes about this area
- Historical evidence — results from past experiments with similar changes
- Confidence gaps — untested or low-confidence assumptions that could derail the feature
- Recommendation — build, experiment first, or gather more data
This prevents the common failure mode of building features based on untested assumptions.
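The four sections above can be sketched as a response shape. The section names come from this page; the field layouts and example values are assumptions:

```typescript
// Hypothetical evaluate_feature response mirroring the four sections above.
const evaluation = {
  relatedBeliefs: [
    { statement: "Social proof reduces signup hesitation", confidence: 0.5 },
  ],
  historicalEvidence: [{ experiment: "testimonial-banner", lift: 0.03 }],
  confidenceGaps: ["No experiment has tested badges on the pricing page itself"],
  recommendation: "experiment_first" as const,
};
```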
## Reviewing Experiment Results
When an experiment has been running long enough, review it:
### Check results

The agent calls `get_results` and presents visitor counts, conversion rates, confidence level, and days running.
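The metric categories (visitor counts, conversion rates, confidence, days running) come from this page; the field names and numbers below are illustrative, showing how conversion rates and lift derive from raw counts:

```typescript
// Hypothetical get_results payload with rates computed from raw counts.
const results = {
  control: { visitors: 4_820, conversions: 241 },
  variant: { visitors: 4_790, conversions: 287 },
  daysRunning: 14,
  confidence: 0.93, // not yet at a typical 0.95 threshold
};

// Conversion rate for one arm of the experiment.
const rate = (arm: { visitors: number; conversions: number }) =>
  arm.conversions / arm.visitors;

// Relative lift of the variant over the control.
const relativeLift = rate(results.variant) / rate(results.control) - 1;
```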
### Decide next steps
The agent offers options:
- Keep running — not enough data for a decision
- Promote the winner — clear result, ready to graduate
- Pause — something looks wrong, investigate
- Suggest next experiment — learn from this and iterate
### Promote and clean up

If you promote a winner, the agent calls `promote_winner` and `record_outcome`. For SDK experiments, it returns code cleanup instructions: remove the `useApexVariant` conditional and keep only the winning variant.
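A before/after sketch of that cleanup. `useApexVariant` comes from this page; the component shape and copy are assumptions, and a local stub stands in for the real hook so the sketch runs on its own:

```typescript
// Stand-in for the real SDK hook, fixed for illustration.
function useApexVariant(experimentKey: string): "control" | "variant" {
  return "variant";
}

// Before promotion: the conditional the experiment required.
function headlineBefore(): string {
  return useApexVariant("pricing-signup-cta") === "variant"
    ? "Start free"
    : "Sign up today";
}

// After promotion: conditional removed, winning copy kept.
function headlineAfter(): string {
  return "Start free";
}
```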
## Getting Smart Suggestions
When you're not sure what to test next, ask:
> "What should I test next on the onboarding flow?"

The agent calls `suggest_experiment` with your context. It analyzes untested beliefs, low-confidence assumptions, and completed experiments with follow-up potential, then returns prioritized suggestions.
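The inputs it weighs (untested beliefs, low-confidence assumptions, follow-up potential) come from this page; the response shape, source labels, and scores below are assumptions:

```typescript
// Hypothetical suggest_experiment response, ordered by priority.
const suggestions = [
  { idea: "Shorten onboarding to three steps", source: "untested_belief", priority: 0.9 },
  { idea: "Re-test welcome email timing", source: "follow_up", priority: 0.7 },
].sort((a, b) => b.priority - a.priority);
```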
> **Info:** Suggestions improve over time. The more beliefs you record and experiments you run, the smarter the recommendations become.
## Next Steps
- Browse all tools for detailed parameter reference
- Learn about beliefs to understand the reasoning layer
- Set up goals before running your first experiment