Analyzing Math Items

This vignette demonstrates mathipy’s feature extraction pipeline using released items from the National Assessment of Educational Progress (NAEP). mathipy includes a small sample of five NAEP mathematics items spanning grades 4 and 8, three assessment years (2017, 2022, and 2024), and three difficulty levels (easy, medium, and hard).

SOURCE: U.S. Department of Education, Institute of Education Sciences, National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 2017, 2022, and 2024 Mathematics Assessments. Items obtained from the NAEP Questions Tool.

Setup

pip install "mathipy[all]"

Load Sample Data

import csv
from mathipy.data import get_sample_csv, get_sample_image, list_sample_images

# newline="" is what the csv module recommends when opening files for reading
with open(get_sample_csv(), newline="") as f:
    reader = csv.DictReader(f)
    items = list(reader)

for item in items:
    print(f"{item['item_id']} | Grade {item['grade']} | {item['difficulty']} | {item['content']}")

2024-4M10 #2 | Grade 4 | Easy      | Algebra
2017-4M1 #4  | Grade 4 | Medium    | Number Properties and Operations
2024-4M13 #2 | Grade 4 | Hard      | Measurement
2022-8M1 #2  | Grade 8 | Easy      | Geometry
2017-8M3 #2  | Grade 8 | Easy      | Data Analysis, Statistics, and Probability

Available sample images:

print(list_sample_images())

['2017-4M1 #4.png', '2017-8M3 #2.png', '2022-8M1 #2.png',
 '2024-4M10 #2.png', '2024-4M13 #2.png']
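
The listing suggests that each image is named after its item ID with a .png suffix. Before running the pipeline, a small check along these lines can confirm that every item has a matching image. The ID and file lists are copied from the outputs above; the helper name find_missing_images is hypothetical and not part of mathipy.

```python
def find_missing_images(item_ids, image_names, suffix=".png"):
    """Return the item IDs that have no image named '<item_id><suffix>'."""
    available = set(image_names)
    return [item_id for item_id in item_ids if item_id + suffix not in available]


# Item IDs as printed from the sample CSV above.
item_ids = [
    "2024-4M10 #2",
    "2017-4M1 #4",
    "2024-4M13 #2",
    "2022-8M1 #2",
    "2017-8M3 #2",
]

# File names as printed by list_sample_images() above.
image_names = [
    "2017-4M1 #4.png",
    "2017-8M3 #2.png",
    "2022-8M1 #2.png",
    "2024-4M10 #2.png",
    "2024-4M13 #2.png",
]

missing = find_missing_images(item_ids, image_names)
print(missing)  # → []
```

An empty list means every row in the CSV can be passed to get_sample_image without a lookup failure.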

Step 1: Extract Text with OCR

Use MultimodalOCR to extract text and math expressions from item images. This requires a Gemini or OpenAI API key.

from mathipy import MultimodalOCR

ocr = MultimodalOCR(provider="gemini")
image_path = str(get_sample_image("2024-4M10 #2"))
result = ocr.extract(image_path)

print(result["full_text"])
print(result["math_expressions"])
print(result["answer_choices"])

For this example, the extracted text is:

Dwayne is D years old. Dwayne’s sister is 3 years younger than Dwayne. Which expression represents Dwayne’s sister’s age in years? A) D+3 B) D-3 C) D×3 D) D÷3

Step 2: Readability Analysis

Analyze extracted text with math-aware normalization. LaTeX and math symbols are replaced with placeholders so they don’t inflate complexity scores.

from mathipy import ReadabilityAnalyzer

analyzer = ReadabilityAnalyzer()
texts = {
    "2024-4M10 #2": "Dwayne is D years old. Dwayne sister is 3 years younger than Dwayne. "
                     "Which expression represents Dwayne sister age in years? "
                     "A) D+3 B) D-3 C) D*3 D) D/3",
    "2017-4M1 #4":  "Divide. 228 / 4 =",
    "2022-8M1 #2":  "The seven points shown on the map represent towns. "
                     "The distances, in kilometers, along roads between towns are given. "
                     "Tamara wants to travel from her town to Brook town in the shortest distance. "
                     "On which roads should Tamara travel? "
                     "Select the appropriate roads to show your answer.",
}

for item_id, text in texts.items():
    result = analyzer.analyze(text)
    print(f"{item_id}: FK Grade={result['flesch_kincaid_grade']:.1f}, "
          f"Reading Ease={result['flesch_reading_ease']:.1f}")

2024-4M10 #2: FK Grade=1.9, Reading Ease=94.5
2017-4M1 #4:  FK Grade=1.3, Reading Ease=91.0
2022-8M1 #2:  FK Grade=5.9, Reading Ease=69.5

The grade 8 geometry item (2022-8M1 #2) has the highest linguistic complexity of the three (FK grade 5.9), driven by longer sentences describing a map with distances. The grade 4 arithmetic item (2017-4M1 #4) is a minimal prompt: a single word followed by a numeric expression.
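
To make the idea of math-aware normalization concrete, here is a minimal sketch that replaces math tokens with a placeholder and then applies the standard Flesch Reading Ease and Flesch-Kincaid Grade formulas. The regular expressions, the MATH placeholder, the syllable heuristic, and the function names normalize_math and flesch_scores are all assumptions made for illustration; mathipy's actual normalization rules are not documented here, so the scores from this sketch should not be expected to match the output above.

```python
import re

LATEX_INLINE = re.compile(r"\$[^$]+\$")      # assumed: inline LaTeX such as $x^2$
MATH_CHARS = re.compile(r"[0-9+*/=×÷^<>]")   # assumed: characters that mark a math token
PUNCT = ".,;:?!"


def normalize_math(text, placeholder="MATH"):
    """Replace LaTeX spans and math-like tokens with a single placeholder word."""
    text = LATEX_INLINE.sub(placeholder, text)
    tokens = []
    for token in text.split():
        core = token.strip(PUNCT + "()")
        if core and MATH_CHARS.search(core):
            trailing = token[len(token.rstrip(PUNCT)):]  # keep sentence punctuation
            tokens.append(placeholder + trailing)
        else:
            tokens.append(token)
    joined = " ".join(tokens)
    # Collapse runs like "MATH MATH MATH" so one expression counts as one word.
    run = rf"{re.escape(placeholder)}(?:\s+{re.escape(placeholder)})+"
    return re.sub(run, placeholder, joined)


def count_syllables(word):
    """Rough heuristic: count vowel groups, discounting a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_scores(text):
    """Compute both Flesch scores from word, sentence, and syllable counts."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        return {"flesch_reading_ease": 0.0, "flesch_kincaid_grade": 0.0}
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = sum(count_syllables(w) for w in words) / len(words)
    return {
        "flesch_reading_ease": 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word,
        "flesch_kincaid_grade": 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59,
    }


normalized = normalize_math("Divide. 228 / 4 =")
print(normalized)  # → Divide. MATH
print(flesch_scores(normalized))
```

Collapsing an expression into one short placeholder keeps digits and operators from being counted as extra words or syllables, which is the inflation the normalization step is meant to avoid.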

Step 3: Math Content Analysis

Classify items by Common Core State Standards for Mathematics (CCSSM) domain and extract math features.

from mathipy import MathContentAnalyzer

math_analyzer = MathContentAnalyzer()
for item_id, text in texts.items():
    result = math_analyzer.analyze(text)
    domain = result["domain_classification"]
    print(f"{item_id}: domain={domain['primary']} "
          f"(confidence={domain['confidence']:.2f}), "
          f"density={result['math_density']:.2f}, "
          f"terms={result['vocabulary']['math_terms']}")

2024-4M10 #2: domain=algebra    (confidence=1.00), density=0.31, terms=['expression']
2017-4M1 #4:  domain=arithmetic (confidence=1.00), density=0.20, terms=['divide']
2022-8M1 #2:  domain=arithmetic (confidence=0.50), density=0.00, terms=['point', 'even']

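The algebra item is classified as algebra with full confidence (1.00). The geometry item (2022-8M1 #2) is labeled arithmetic at only 0.50 confidence and has a math density of 0.00, because its complexity is visual (a map diagram), not textual. Its term list also includes 'even', which never occurs as a standalone word in the text and appears to be matched inside "seven", so text-only classifications of diagram-heavy items should be treated with caution.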
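
The output for 2022-8M1 #2 lists 'even' as a math term even though the text only contains the word "seven". One plausible cause, not confirmed by the source, is substring matching against a vocabulary list. The sketch below contrasts substring matching with whole-word matching. The vocabulary and both function names are hypothetical and are not mathipy's term list or API.

```python
import re

# Hypothetical vocabulary for illustration only.
MATH_TERMS = ["point", "even", "divide", "expression"]


def match_substring(text, terms):
    """Report a term whenever it occurs anywhere in the text, even inside a longer word."""
    lowered = text.lower()
    return [term for term in terms if term in lowered]


def match_whole_word(text, terms):
    """Report a term only when it appears as its own word (optionally pluralized with 's')."""
    lowered = text.lower()
    return [
        term
        for term in terms
        if re.search(rf"\b{re.escape(term)}s?\b", lowered)
    ]


text = "The seven points shown on the map represent towns."
print(match_substring(text, MATH_TERMS))   # → ['point', 'even']
print(match_whole_word(text, MATH_TERMS))  # → ['point']
```

Word-boundary anchors remove the false positive while still accepting the plural "points".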

Step 4: Cognitive Load Estimation

Extract raw cognitive load indicators: element counts, operation counts, and density ratios.

from mathipy import CognitiveLoadEstimator

estimator = CognitiveLoadEstimator()
for item_id, text in texts.items():
    result = estimator.estimate(text)
    print(f"{item_id}: elements={result['numeric_elements']}, "
          f"ops={result['operation_count']}, "
          f"vars={result['variable_count']}, "
          f"density={result['element_density']:.3f}")

2024-4M10 #2: elements=1, ops=4, vars=4, density=0.172
2017-4M1 #4:  elements=2, ops=2, vars=0, density=0.400
2022-8M1 #2:  elements=0, ops=0, vars=0, density=0.000

The arithmetic item (2017-4M1 #4) has the highest element density (0.400): two numbers and a division operator packed into a few tokens. The geometry item (2022-8M1 #2) has zero numeric elements; its complexity lies in the lengthy verbal instructions, which show up as higher word and sentence counts rather than in these indicators.
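
As a rough illustration of what such indicators can look like, the sketch below counts distinct numbers, distinct operator symbols, and distinct single-letter variables, then divides numbers plus variables by the number of whitespace-separated tokens. These definitions are assumptions chosen for the example, and count_indicators is a hypothetical helper; the source does not state how CognitiveLoadEstimator computes its values. The dictionary keys reuse the names from the output above only for readability.

```python
import re

NUMBER_RE = re.compile(r"\d+(?:\.\d+)?")   # integers and simple decimals
OPERATOR_RE = re.compile(r"[+\-*/=×÷]")    # assumed operator set
VARIABLE_RE = re.compile(r"\b[A-Z]\b")     # assumed: standalone capital letters


def count_indicators(text):
    """Count distinct numbers, operators, and variables, plus a density ratio."""
    tokens = text.split()
    numbers = set(NUMBER_RE.findall(text))
    operators = set(OPERATOR_RE.findall(text))
    variables = set(VARIABLE_RE.findall(text))
    # Assumed definition: (distinct numbers + distinct variables) per token.
    density = (len(numbers) + len(variables)) / len(tokens) if tokens else 0.0
    return {
        "numeric_elements": len(numbers),
        "operation_count": len(operators),
        "variable_count": len(variables),
        "element_density": density,
    }


print(count_indicators("Divide. 228 / 4 ="))
# → {'numeric_elements': 2, 'operation_count': 2, 'variable_count': 0, 'element_density': 0.4}
```

A regex-based count like this is crude: a hyphen in ordinary prose counts as an operator, and answer-choice labels such as "A)" count as variables, so the numbers are best read as raw indicators rather than calibrated scores.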

Step 5: Visual Feature Extraction

Extract image complexity features from the item screenshots.

from mathipy import VisualFeatureExtractor

extractor = VisualFeatureExtractor()
for item in items[:3]:
    image_path = str(get_sample_image(item["item_id"]))
    features = extractor.extract(image_path)
    dims = features["dimensions"]
    score = features["complexity_score"]
    shapes = features["structural_elements"]
    print(f"{item['item_id']}: "
          f"{dims['width']}x{dims['height']}, "
          f"edge_ratio={score['edge_ratio']:.3f}, "
          f"shapes={score['total_shapes']}, "
          f"lines={shapes['line_count']}")

2024-4M10 #2: 520x390,  edge_ratio=0.105, shapes=1, lines=39
2017-4M1 #4:  872x147,  edge_ratio=0.072, shapes=1, lines=0
2024-4M13 #2: (measurement item with visual content)
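
The source does not define edge_ratio. A common reading is the share of pixels that lie on an edge, and the sketch below computes that quantity for a small grayscale grid. The simple gradient threshold stands in for a real edge detector, and the function name, threshold, and pure-Python image representation are assumptions for illustration rather than VisualFeatureExtractor's method.

```python
def edge_ratio(pixels, threshold=50):
    """Fraction of pixels whose forward intensity difference exceeds the threshold.

    `pixels` is a list of rows, each a list of grayscale values in 0..255.
    """
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    if height == 0 or width == 0:
        return 0.0

    edge_count = 0
    for r in range(height):
        for c in range(width):
            # Forward differences; pixels in the last column/row have no neighbor.
            dx = abs(pixels[r][c + 1] - pixels[r][c]) if c + 1 < width else 0
            dy = abs(pixels[r + 1][c] - pixels[r][c]) if r + 1 < height else 0
            if max(dx, dy) > threshold:
                edge_count += 1
    return edge_count / (height * width)


# A 4x4 image: black left half, white right half, so one vertical edge.
image = [[0, 0, 255, 255] for _ in range(4)]
print(edge_ratio(image))  # → 0.25
```

Under this reading, a text-only screenshot would score low and a dense diagram would score higher, which is consistent with using the ratio as a visual complexity indicator.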

Combining Features

Per-item features can be collected into a single record (a feature vector) for research use. The example below combines item metadata with visual features:

import csv
from mathipy import VisualFeatureExtractor
from mathipy.data import get_sample_csv, get_sample_image

extractor = VisualFeatureExtractor()

# newline="" is what the csv module recommends when opening files for reading
with open(get_sample_csv(), newline="") as f:
    items = list(csv.DictReader(f))

results = []
for item in items:
    image_path = str(get_sample_image(item["item_id"]))
    visual = extractor.extract(image_path)

    results.append({
        "item_id": item["item_id"],
        "grade": item["grade"],
        "difficulty": item["difficulty"],
        "content": item["content"],
        "edge_ratio": visual["complexity_score"]["edge_ratio"],
        "total_shapes": visual["complexity_score"]["total_shapes"],
        "lines": visual["structural_elements"]["line_count"],
    })

for r in results:
    print(f"{r['item_id']:16s} | {r['difficulty']:6s} | "
          f"edge_ratio={r['edge_ratio']:.3f} | "
          f"shapes={r['total_shapes']:3d} | lines={r['lines']:3d}")

Text-based features (readability, math content, cognitive load) can be added after OCR extraction, building a complete feature table suitable for statistical analysis.
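
One way to build such a table is to merge the visual and text feature dictionaries by item ID and write the result as CSV. The sketch below uses only the standard library. The feature values are copied from the outputs shown earlier in this vignette, while merge_features and to_csv are hypothetical helpers, not mathipy functions.

```python
import csv
import io

# Visual features as printed in Step 5.
visual_rows = [
    {"item_id": "2024-4M10 #2", "edge_ratio": 0.105, "total_shapes": 1, "lines": 39},
    {"item_id": "2017-4M1 #4", "edge_ratio": 0.072, "total_shapes": 1, "lines": 0},
]

# Text features as printed in Steps 2 and 3, keyed by item ID.
text_rows = {
    "2024-4M10 #2": {"flesch_kincaid_grade": 1.9, "math_density": 0.31},
    "2017-4M1 #4": {"flesch_kincaid_grade": 1.3, "math_density": 0.20},
}


def merge_features(visual_rows, text_rows):
    """Attach text features to each visual row; items without text features are kept."""
    merged = []
    for row in visual_rows:
        combined = dict(row)
        combined.update(text_rows.get(row["item_id"], {}))
        merged.append(combined)
    return merged


def to_csv(rows):
    """Serialize rows to CSV text, using the union of keys as the header."""
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


table = merge_features(visual_rows, text_rows)
print(to_csv(table))
```

Keying the merge on item_id keeps the table aligned even when OCR or text analysis is skipped for some items; those rows simply get empty cells for the missing columns.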