TextAnalysisR provides comprehensive AI/NLP capabilities via local and web-based providers.
## Supported Providers

| Provider | Type | API Key | Strengths |
|---|---|---|---|
| Ollama | Local | None | Privacy, no cost, offline use |
| OpenAI | Web-based | OPENAI_API_KEY | Quality, speed |
| Gemini | Web-based | GEMINI_API_KEY | Quality, speed |
| spaCy | Local | None | Linguistic analysis |
| Transformers | Local | None | Embeddings, sentiment |
## Feature Categories

### 1. Topic-Grounded Content Generation
Generate content grounded in your validated topic terms (not generic
AI knowledge):
Content types available:

- `survey_item`: Likert-scale questionnaire items
- `research_question`: literature review questions
- `theme_description`: academic theme summaries
- `policy_recommendation`: actionable policy suggestions
- `interview_question`: open-ended interview prompts
```r
# Generate topic labels
labels <- generate_topic_labels(
  top_topic_terms,
  provider = "ollama",
  model = "phi3:mini"
)

# Generate survey items
items <- generate_topic_content(
  topic_terms_df,
  content_type = "survey_item",
  provider = "openai"
)
```
### 2. Semantic Analysis & Clustering

```r
# Generate cluster labels
cluster_labels <- generate_cluster_labels(
  cluster_keywords,
  provider = "auto" # Tries Ollama first, then web-based APIs
)
```

### 3. RAG Document Search

```r
# RAG search over your documents
result <- run_rag_search(
  query = "What are the main findings?",
  documents = my_docs,
  provider = "openai"
)
```
### 4. Linguistic Analysis (spaCy)

Deep linguistic processing via spaCy NLP models:
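The section above does not show TextAnalysisR's own spaCy interface, so the sketch below uses the spacyr package, a common R bridge to spaCy, to illustrate the kind of linguistic output (tokens, part-of-speech tags, named entities) this feature provides. The function names here come from spacyr, not TextAnalysisR.

```r
library(spacyr)

# Attach a locally installed spaCy model (no network access needed)
spacy_initialize(model = "en_core_web_sm")

# Tokenize with part-of-speech tags and named entities
parsed <- spacy_parse(
  "TextAnalysisR supports deep linguistic analysis.",
  pos = TRUE,
  entity = TRUE
)
head(parsed)

# Release the spaCy session when done
spacy_finalize()
```

The returned data frame has one row per token, with columns for lemma, POS tag, and entity type, which can feed directly into downstream analyses.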
### 5. LLM API Access

Unified interface for all providers:

```r
# Provider-agnostic (recommended)
response <- call_llm_api(
  provider = "openai",
  system_prompt = "You are a helpful assistant.",
  user_prompt = "Summarize this text..."
)

# Provider-specific
call_openai_chat(system_prompt, user_prompt, model = "gpt-4o-mini")
call_gemini_chat(system_prompt, user_prompt, model = "gemini-2.0-flash")
call_ollama(prompt, model = "phi3:mini")
```
## Responsible AI Design

All AI features follow NIST AI Risk Management Framework principles:

| Principle | Implementation |
|---|---|
| Human oversight | AI suggests, you review and approve |
| User control | Edit, regenerate, or override any output |
| Transparency | View prompts and parameters used |
| Privacy | Local options (Ollama, spaCy) for sensitive data |
| Grounding | Content based on your data, not generic knowledge |
## Setup

### Local AI (Ollama)

```r
# 1. Install Ollama: https://ollama.com
# 2. Pull a model (in terminal):
#    ollama pull phi3:mini
#    ollama pull llama3.1:8b
# 3. Verify in R:
check_ollama()
list_ollama_models()
```
### Web-based AI (OpenAI/Gemini)

```r
# Set API keys (choose one or both)
Sys.setenv(OPENAI_API_KEY = "your-openai-key")
Sys.setenv(GEMINI_API_KEY = "your-gemini-key")

# Or use .env file in project root:
# OPENAI_API_KEY=your-key
# GEMINI_API_KEY=your-key
```
### Linguistic Analysis (spaCy)
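The original setup steps for spaCy are not included here. As a hedged sketch, the spacyr package (an assumption; TextAnalysisR may provide its own helpers) can install spaCy and a language model into a dedicated Python environment:

```r
# spacyr functions, not TextAnalysisR functions
# 1. Install spaCy into a managed Python environment
spacyr::spacy_install()

# 2. Download an English language model
spacyr::spacy_download_langmodel("en_core_web_sm")

# 3. Verify the model loads
spacyr::spacy_initialize(model = "en_core_web_sm")
```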
## Default Models

| Provider | Chat model | Embedding model |
|---|---|---|
| OpenAI | gpt-4o-mini | text-embedding-3-small |
| Gemini | gemini-2.0-flash | text-embedding-004 |
| Ollama | phi3:mini | - |
| Local | - | all-MiniLM-L6-v2 |
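The defaults above can be overridden per call. The provider-specific functions shown earlier accept a `model` argument, so (assuming any model ID valid for that provider's API is accepted) switching chat models looks like:

```r
# Override the default gpt-4o-mini with another OpenAI model ID
response <- call_openai_chat(
  system_prompt = "You are a concise summarizer.",
  user_prompt = "Summarize this text...",
  model = "gpt-4o"
)
```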