The quality of what K-Dense Web produces depends almost entirely on what you ask for. A vague prompt gets generic output. A specific one gets you something you can actually use.

This guide covers the six elements that make a prompt work. Get them right and you'll rarely need to iterate.

The six elements of an effective prompt

A complete K-Dense Web prompt addresses these six areas:

Clear objective - What do you want to achieve?
Data source - Where is your data coming from?
Deliverables - What outputs do you need?
Method preferences - Any specific approaches or tools?
Target audience - Who will use the results?
Additional context - What else might help?

Here's how each one works.

1. Clear objective

K-Dense Web breaks your task into steps, and those steps are only as good as the goal you give it. Vague in, vague out.

❌ Vague objective

Analyze my data and tell me what's interesting.

✅ Clear objective

Identify the top 5 factors that predict customer churn 
in our SaaS product, and quantify the impact of each factor 
on 90-day retention rates.

Tips for writing clear objectives

Be specific about the outcome you need, not just the general topic
Include success criteria when possible (e.g., "achieve at least 85% accuracy")
State the business question you're trying to answer
Mention constraints like timeline, budget, or regulatory requirements

More examples

Vague	Clear
"Analyze sales data"	"Identify seasonal patterns in Q1-Q4 sales and forecast Q1 2027 revenue with confidence intervals"
"Help with my research"	"Conduct a systematic literature review on CRISPR delivery mechanisms, focusing on papers from 2023-2026"
"Look at this dataset"	"Build a classification model to predict loan defaults using the attached credit data, optimizing for precision to minimize false positives"

2. Clear data source

K-Dense Web works with uploaded files, public datasets, web sources, or synthetic data it generates - but you need to say which. Being vague about the source is the fastest way to get an analysis built on the wrong data.

Data source options

Source Type	When to Use	How to Specify
Uploaded Data	You have proprietary or specific data	"Use the attached CSV file containing our customer transactions"
Public Data	Standard datasets or open sources	"Use the UCI Heart Disease dataset" or "Pull S&P 500 data from Yahoo Finance"
Synthetic Data	Prototyping, demos, or when real data isn't available	"Generate a synthetic dataset of 10,000 patient records with realistic distributions"
Web Sources	Current information needed	"Gather data from recent SEC filings for Fortune 500 tech companies"

Any format, any source

K-Dense Web can read any file format that open-source tools support. This includes:

Category	Supported Formats
Tabular Data	CSV, TSV, Excel (.xlsx, .xls), Parquet, Feather, HDF5, SQLite, JSON, XML
Documents	PDF, Word (.docx), PowerPoint (.pptx), Markdown, HTML, LaTeX, RTF
Scientific	MATLAB (.mat), SAS (.sas7bdat), Stata (.dta), SPSS (.sav), NetCDF, FITS
Geospatial	Shapefile, GeoJSON, KML, GeoTIFF, GPX
Images	PNG, JPEG, TIFF, SVG, DICOM (medical imaging)
Code & Config	Python (.py), R (.R), Jupyter (.ipynb), YAML, JSON, TOML, SQL
Compressed	ZIP, TAR, GZIP, 7z (automatically extracted)
Domain-Specific	FASTA/FASTQ (genomics), PDB (proteins), VCF (variants), and more

If Python or R can read it, K-Dense Web can work with it. Just describe what the file contains in your prompt.

Example prompts by data source

Uploaded Data:

Using the attached sales_data.xlsx file, analyze regional 
performance trends and identify underperforming territories. 
The file contains columns for date, region, product_category, 
revenue, and units_sold.

Public Data:

Using the Kaggle Titanic dataset, build a survival prediction 
model and explain which features are most important.

Synthetic Data:

Generate a realistic synthetic dataset of e-commerce transactions 
(~50,000 rows) with customer demographics, purchase history, and 
churn labels. Then build a churn prediction model using this data.

Combined Sources:

Combine our internal customer data (attached) with publicly 
available census data for demographic enrichment, then segment 
customers by predicted lifetime value.

Pro tip: describe your data

When uploading data, briefly describe what's in it:

The attached dataset (clinical_trial_results.csv) contains:
- 2,847 patient records from our Phase 2 trial
- Columns: patient_id, age, sex, treatment_arm, baseline_score, 
  week4_score, week12_score, adverse_events, dropout_flag
- Primary endpoint: change in score from baseline to week 12

This helps K-Dense Web apply the right analysis methods without guessing at structure.
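
If you'd rather not type the description by hand, a short script can draft it for you. The following is a minimal sketch using only Python's standard library; the function name `describe_csv` and the wording of its output are illustrative assumptions, not part of K-Dense Web.

```python
import csv
import io


def describe_csv(text: str, filename: str) -> str:
    """Summarize a CSV's row count and column names for pasting into a prompt."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader, [])  # the first row is assumed to hold column names
    row_count = sum(1 for row in reader if row)  # ignore blank lines
    return (
        f"The attached dataset ({filename}) contains:\n"
        f"- {row_count:,} records\n"
        f"- Columns: {', '.join(header)}"
    )


# Hypothetical three-column sample standing in for a real file.
sample = "patient_id,age,treatment_arm\n1,54,placebo\n2,61,active\n"
print(describe_csv(sample, "clinical_trial_results.csv"))
```

For a real file, read its contents first (for example with `open(path, newline="").read()`) and pass them as `text`. You still need to add what the script can't know, such as the primary endpoint or what each column means.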

3. Clear deliverables

Without clear deliverables, you might get a report when you needed a notebook, or five charts when you needed twenty. Specify what you want.

K-Dense Web can generate outputs in any format producible by open-source tools:

Output Type	Available Formats
Documents	PDF, Word (.docx), Markdown, HTML, LaTeX, RTF
Presentations	PowerPoint (.pptx), PDF slides, HTML slides (reveal.js)
Spreadsheets	Excel (.xlsx), CSV, Parquet, JSON
Visualizations	PNG, SVG, PDF (vector), interactive HTML (Plotly, Bokeh)
Code	Python scripts, Jupyter notebooks, R scripts, SQL queries
Data Exports	Any tabular format, serialized models (.pkl, .joblib), ONNX

If there's an open-source library that produces it, K-Dense Web can generate it.

Specify these details

Output type(s): Report, presentation, code, paper, figures, etc.
Quantity: How many visualizations, slides, or pages?
Format preferences: PDF, PowerPoint, Python notebook, Word doc?
Level of detail: Executive summary vs. comprehensive technical report?

Example deliverable specifications

Minimal (okay):

Generate a report with visualizations.

Better:

Generate:
1. An executive summary (1 page) with key findings
2. A detailed technical report (5-10 pages) with methodology
3. 5-7 publication-quality figures
4. The Python code used for analysis (Jupyter notebook)

Best:

Deliverables needed:
1. Executive presentation (10-12 slides, PowerPoint format) 
   for board meeting - focus on business impact
2. Technical appendix (PDF) with full statistical methodology 
   and assumptions
3. Interactive dashboard mockup showing key metrics
4. Reproducible Python code (Jupyter notebook) with comments
5. One-page summary suitable for press release

Common deliverable types

Type	Best For	Typical Specification
Report	Comprehensive analysis	"10-15 page report with executive summary"
Presentation	Stakeholder communication	"12-15 slides, suitable for non-technical audience"
Code	Reproducibility, deployment	"Jupyter notebook with documented functions"
Paper	Academic publication	"Formatted for Nature Methods, ~3000 words"
Figures	Publication, reports	"5-7 figures, 300 DPI, suitable for print"
Dashboard	Ongoing monitoring	"Interactive dashboard with key KPIs"

4. Method preferences

If your organization has standards, or your results need to comply with specific guidelines, say so up front. Otherwise K-Dense Web picks methods on its own, which is usually fine but not always what you need.

When to specify methods

Regulatory requirements: "Must use FDA-accepted statistical methods"
Organizational standards: "We use scikit-learn for all ML models"
Reproducibility: "Use only packages available in our production environment"
Interpretability: "Prefer interpretable models (logistic regression, decision trees) over black-box approaches"
Specific techniques: "Apply SHAP values for feature importance"

Example method specifications

Statistical Preferences:

Use parametric tests where assumptions are met; otherwise 
fall back to non-parametric alternatives. Report effect sizes 
and confidence intervals, not just p-values.
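
To make "effect sizes and confidence intervals" concrete, here is a rough sketch of what that kind of reporting involves for a two-group comparison. It uses only Python's standard library. The choice of Cohen's d and a percentile bootstrap is an assumption made for illustration (K-Dense Web may pick different estimators), and the sample values are hypothetical.

```python
import random
import statistics


def cohens_d(a: list[float], b: list[float]) -> float:
    """Standardized mean difference using the pooled sample standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / pooled_var ** 0.5


def bootstrap_ci(a, b, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the difference in means."""
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    diffs = []
    for _ in range(n_resamples):
        resampled_a = rng.choices(a, k=len(a))  # sample with replacement
        resampled_b = rng.choices(b, k=len(b))
        diffs.append(statistics.mean(resampled_a) - statistics.mean(resampled_b))
    diffs.sort()
    low_index = round((1 - level) / 2 * n_resamples)
    high_index = round((1 + level) / 2 * n_resamples) - 1
    return diffs[low_index], diffs[high_index]


# Hypothetical scores for two groups.
treatment = [12.1, 14.3, 13.8, 15.2, 14.9, 13.5]
control = [11.0, 12.4, 11.9, 12.8, 12.1, 11.6]
d = cohens_d(treatment, control)
low, high = bootstrap_ci(treatment, control)
print(f"Cohen's d = {d:.2f}; 95% CI for mean difference = [{low:.2f}, {high:.2f}]")
```

A p-value alone says only whether a difference is detectable. The effect size says how large it is, and the interval says how precisely it was estimated.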

Package Preferences:

Use pandas and scikit-learn for data processing and modeling. 
For visualization, use matplotlib and seaborn (not plotly). 
For statistical tests, use scipy.stats.

Methodology Preferences:

For the survival analysis, use Cox proportional hazards models. 
Check the proportional hazards assumption and use stratification 
if violated. Report hazard ratios with 95% CIs.

Source Preferences:

For literature review, prioritize peer-reviewed sources from 
PubMed and Google Scholar. Include preprints from bioRxiv only 
if directly relevant. Exclude sources published before 2020.

If you don't have preferences

K-Dense Web will pick methods based on your data and objective. Just say:

Use whatever methods are most appropriate for this analysis. 
Explain your methodology choices in the report.

5. Target audience

An executive summary and a technical paper covering the same analysis look completely different. Specify who's reading.

Audience dimensions to consider

Technical level: Expert, intermediate, non-technical
Role: Executive, researcher, engineer, regulator, investor
Domain familiarity: Industry expert vs. general business audience
Decision context: What decision will this inform?

Example audience specifications

Executive Audience:

Target audience: C-suite executives with limited technical 
background. Focus on business implications and ROI. Minimize 
jargon. Lead with recommendations, then supporting evidence.

Technical Audience:

Target audience: Data science team for peer review. Include 
full methodology, code, and statistical details. Assume 
familiarity with ML concepts and Python.

Regulatory Audience:

Target audience: FDA reviewers for IND submission. Follow 
ICH E9 guidelines for statistical reporting. Include all 
required tables and figures per agency guidance.

Mixed Audience:

Two audiences: (1) Executive summary for leadership - focus 
on strategic implications, (2) Technical appendix for 
engineering team - include implementation details and code.

6. Additional context

This is the catch-all: prior work, constraints, success criteria, reference files. The more relevant context you provide, the less time gets spent going in the wrong direction.

Types of additional context

Prior Work

We previously analyzed this dataset in Q2 (see attached 
Q2_analysis.pdf). Build on those findings. Don't repeat 
the exploratory analysis; focus on the predictive modeling.

Data Documentation

Attached: data_dictionary.xlsx explaining all column 
definitions and valid values. Also attached: 
study_protocol.pdf with the experimental design.

Code Files (Python, R, etc.)

Attached: preprocessing_pipeline.py - this is our current 
data cleaning code. Please use the same transformations 
for consistency. Also see feature_engineering.R for the 
derived variables we've already validated.

Reference code attached:
- baseline_model.ipynb: Our current production model (beat this)
- utils.py: Helper functions for our data format
- config.yaml: Feature definitions and thresholds we use

Reference Documents (PDFs, Papers, Reports)

Key references attached:
- smith_et_al_2024.pdf: The methodology we want to replicate
- FDA_guidance_SAMD.pdf: Regulatory requirements to follow
- competitor_whitepaper.pdf: Benchmark we need to exceed

Please review the attached materials:
- literature_review.pdf: Summary of 50 relevant papers
- domain_expert_notes.pdf: SME feedback on initial analysis
- previous_submission_feedback.pdf: Reviewer comments to address

Presentations and Slide Decks

Attached: Q3_board_presentation.pptx - this is the format 
and style leadership expects. Match this design language 
for the new presentation.

Reference slides attached:
- investor_deck_template.pptx: Use this template
- competitor_pitch.pdf: What we're positioning against
- brand_guidelines.pdf: Color palette and fonts to use

Constraints and Requirements

Constraints:
- Analysis must be reproducible with Python 3.10+
- Cannot use cloud APIs (all processing must be local)
- Results needed by Friday for board presentation
- Budget for compute: keep under 100 GPU-hours

Optimization Criteria

Optimize for:
- Precision over recall (false positives are costly)
- Model interpretability (need to explain to regulators)
- Inference speed (model will run in production at 1000 QPS)
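
If the precision/recall trade-off is unfamiliar, the arithmetic is short. This sketch computes both from confusion-matrix counts; the counts are hypothetical and stand for the same model at two different decision thresholds.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall) from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0  # share of alerts that were correct
    recall = tp / (tp + fn) if tp + fn else 0.0     # share of real cases that were caught
    return precision, recall


# Strict threshold: few alerts, most of them right, many cases missed.
strict = precision_recall(tp=40, fp=5, fn=60)    # precision 0.89, recall 0.40
# Lenient threshold: many alerts, more false alarms, few cases missed.
lenient = precision_recall(tp=90, fp=60, fn=10)  # precision 0.60, recall 0.90

for name, (p, r) in {"strict": strict, "lenient": lenient}.items():
    print(f"{name}: precision={p:.2f}, recall={r:.2f}")
```

Stating which of the two matters more tells K-Dense Web which end of this trade-off to aim for.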

Domain-Specific Context

Context: This is for a medical device submission. All 
statistical methods must align with FDA guidance for 
AI/ML-based Software as a Medical Device (SaMD). 
See attached FDA guidance document.

What Success Looks Like

Success criteria:
- Model AUC > 0.85 on held-out test set
- Identify at least 3 actionable feature engineering opportunities
- Generate investor-ready visualizations
- Complete analysis within 4 hours
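
A criterion like "AUC > 0.85" is useful because it can be checked mechanically. As a rough illustration, this sketch computes ROC AUC directly from its definition: the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, with ties counting as half. The function name, labels, and scores are hypothetical. In practice a library routine such as scikit-learn's `roc_auc_score` would be used instead of this quadratic-time version.

```python
def roc_auc(labels: list[int], scores: list[float]) -> float:
    """ROC AUC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    # Count positive-over-negative wins; a tie contributes half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical held-out labels and model scores.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(y_true, y_score)
print(f"AUC = {auc:.2f}, meets target: {auc > 0.85}")
```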

Attachment quick reference

K-Dense Web can handle any file format readable by open-source tools. Common attachment types:

Attachment Type	Examples	Why It Helps
Tabular data	.csv, .xlsx, .parquet, .json, .sas7bdat, .dta	The actual data to analyze
Code files	.py, .R, .ipynb, .sql, .m (MATLAB)	Existing pipelines to build on or replicate
Documentation	.pdf, .docx, .md, .html	Data dictionaries, protocols, requirements
Reference papers	.pdf, .html	Methodologies to follow or replicate
Presentations	.pptx, .pdf, .key	Style templates and prior work
Config files	.yaml, .json, .toml, .ini	Feature definitions, thresholds, parameters
Images/Figures	.png, .jpg, .svg, .tiff, .dcm (DICOM)	Examples of desired visualization style
Scientific data	.mat, .nc, .fits, .fasta, .vcf, .pdb	Domain-specific formats (genomics, astronomy, etc.)
Geospatial	.shp, .geojson, .kml, .gpx	Geographic and mapping data
Archives	.zip, .tar.gz, .7z	Compressed collections (auto-extracted)

Don't see your format? Upload it anyway. If Python or R can read it, K-Dense Web can process it.

Putting it all together

Here's what a prompt looks like when all six elements are in place:

OBJECTIVE:
Build a predictive model for hospital readmission within 30 days 
of discharge. Identify the top risk factors and quantify their 
impact on readmission probability.

DATA SOURCE:
Use the attached patient_data.csv file, which contains 50,000 
discharge records from 2023-2025. Columns include demographics, 
diagnosis codes, length of stay, prior admissions, and 
readmission flag. See attached data_dictionary.xlsx for 
column definitions.

DELIVERABLES:
1. Executive summary (2 pages) for hospital leadership
2. Technical report (10-15 pages) with full methodology
3. 6-8 publication-quality figures
4. Python code (Jupyter notebook) for reproducibility
5. One-page clinical decision support guide for care managers

METHOD PREFERENCES:
- Use XGBoost or LightGBM for the primary model
- Apply SHAP values for interpretability
- Use scikit-learn for preprocessing
- Report AUC, sensitivity, specificity, and calibration metrics

TARGET AUDIENCE:
Primary: Hospital quality improvement committee (clinical 
background, limited ML expertise)
Secondary: Data science team (for technical validation)

ADDITIONAL CONTEXT:
Attachments included:
- patient_data.csv: Main dataset (50,000 records)
- data_dictionary.xlsx: Column definitions and valid values
- Q3_readmission_pilot.pdf: Prior analysis showing promising 
  results with length of stay and comorbidity count
- current_preprocessing.py: Our existing data cleaning pipeline
- cms_readmission_definitions.pdf: Official CMS methodology
- board_template.pptx: Slide format leadership expects

Additional notes:
- Optimize for sensitivity (catching high-risk patients is more 
  important than minimizing false positives)
- Must align with CMS Hospital Readmissions Reduction Program 
  definitions
- Results will inform a care management pilot program

Quick reference checklist

Before submitting your prompt, check that you've addressed:

Element	Question to Ask	Included?
Objective	What specific outcome do I need?	☐
Data Source	Where is the data coming from?	☐
Deliverables	What outputs do I need, in what format?	☐
Methods	Any required or preferred approaches?	☐
Audience	Who will use these results?	☐
Context	What else would help? Attachments? Constraints?	☐
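
If you write prompts often, you can turn the checklist into a small helper. This is a minimal sketch and not a K-Dense Web API: `build_prompt` and `ELEMENTS` are hypothetical names, and the section labels are borrowed from the full example earlier in this guide. It assembles the sections you filled in and reports the ones you left empty.

```python
ELEMENTS = [
    "OBJECTIVE",
    "DATA SOURCE",
    "DELIVERABLES",
    "METHOD PREFERENCES",
    "TARGET AUDIENCE",
    "ADDITIONAL CONTEXT",
]


def build_prompt(sections: dict[str, str]) -> tuple[str, list[str]]:
    """Return the assembled prompt text and the list of elements left empty."""
    missing = [name for name in ELEMENTS if not sections.get(name, "").strip()]
    parts = [
        f"{name}:\n{sections[name].strip()}"
        for name in ELEMENTS
        if name not in missing
    ]
    return "\n\n".join(parts), missing


prompt, missing = build_prompt({
    "OBJECTIVE": "Identify the top 5 factors that predict customer churn.",
    "DATA SOURCE": "Use the attached CSV file containing our customer transactions.",
})
print(prompt)
print("Not yet addressed:", ", ".join(missing))
```

An empty element is a reminder, not an error. Simple analyses may only need the first two.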

The bottom line

Five minutes spent structuring your prompt can save hours of iteration. K-Dense Web handles the complexity - your job is to be clear about what you need.

You don't need all six elements every time. Simple analyses often just need an objective and a data source. The full template is for complex projects where getting it right the first time matters.

When in doubt, include more context.

Questions? Join our Slack community or reach out at contact@k-dense.ai.