Introduction: How to formalize a verbal theory with VAST

Visual Argument Structure Tool (VAST) by Leising, Grenke & Cramer

Daniel Leising

Technical University Dresden

Felix Schönbrodt

Ludwig-Maximilians-Universität München

2024-10-25

Where do theories come from?

  • Falsificationism starts with a (tentative) theory, which is repeatedly tested, refuted, and refined.
  • But where does the initial theory come from? “Flair”? Flash of inspiration in the shower?
  • Popper did not care much about the first stage, the creation of theories; he focused on their later testing.

Conjecture: We have a refined methodology¹ to test theories (e.g., experimental designs, statistical methods, preregistration, …). But so far we have had no good methodology for constructing theories.

¹ “A scientific methodology is an ordered series of steps that assist a researcher in reaching a desired end state from a specified starting point.” (Borsboom et al., 2021)

Sir Karl Popper, 1902 - 1994

Visual Argument Structure Tool (VAST) V1.0

Introduction

This is a brief introduction to some of the core features of VAST. Along the way, I will simplify a few features and ignore others completely.

[Diagram: X → Y, arrow labeled p]

Types of elements (selection)

Arrows stand for IF-THEN relationships

If X, THEN Y

[Diagram: X → Y]

Six types of relationships:

  • naming (n)
  • conceptual implication (i)
  • causation (c)
  • prediction (p)
  • transformation (t)
  • reasoning (r)
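
One way to make the six relationship types concrete is to encode arrows as data. The following is a minimal Python sketch, not part of VAST itself: the names `RelationType`, `Arrow`, and `describe` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class RelationType(Enum):
    """The six VAST relationship types, keyed by their one-letter labels."""
    NAMING = "n"
    IMPLICATION = "i"
    CAUSATION = "c"
    PREDICTION = "p"
    TRANSFORMATION = "t"
    REASONING = "r"


@dataclass(frozen=True)
class Arrow:
    """A directed IF-THEN relationship between two display elements."""
    source: str
    target: str
    kind: RelationType


def describe(arrow: Arrow) -> str:
    """Render an arrow as an IF-THEN statement plus its type label."""
    return f"IF {arrow.source}, THEN {arrow.target} ({arrow.kind.value})"


# Looking up the type by its label rejects anything outside the six types.
arrow = Arrow("X", "Y", RelationType("p"))
print(describe(arrow))  # IF X, THEN Y (p)
```

Using an enumeration means that a mistyped label such as `RelationType("x")` raises a `ValueError` instead of silently creating a seventh relationship type.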

Naming concepts

[Diagram: concept I → name "Intelligent", arrow labeled n]

[Diagram: concept I → name "For example, this entire sentence here may also be used as a name in a VAST display", arrow labeled n]

  • The I in the box symbolizes a concept
    • A concept is an idea about a condition that may or may not apply to certain objects (e.g., people)
    • The labels of concepts (i.e., I) are abstract and basically arbitrary
  • The n stands for a naming relationship
  • A name is a word (or sentence, or paragraph) that is used to describe those objects to which a concept applies. Names (or “labels”) are set in quotation marks "" in VAST displays.

Naming concepts

Synonyms

[Diagram: concept I with three naming arrows (n) to the names "Intelligent", "Bright", and "Smart"]

A concept may have any number of names. Here, all three names are synonyms regarding concept I.

Naming concepts

Homonyms

[Diagram: concepts F and T each have a naming arrow (n) to the same name "Big"]

This expresses the idea that the same name is used for conditions that may not be the same. This name might thus be a homonym.

Naming concepts

Visualizing a jingle-jangle situation

[Diagram: F → "Famous" (n), F → "Big" (n), T → "Big" (n), T → "Tall" (n)]

  • “Big” = jingle for F and T
  • “Big” and “Famous” = jangle for F
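
The jingle-jangle display can also be checked mechanically once the naming arrows are written down as pairs. This is only a sketch under that assumption; the function names `jingles` and `jangles` are hypothetical and not VAST terminology.

```python
from collections import defaultdict

# Naming arrows from the jingle-jangle display: (concept, name).
naming = [
    ("F", "Famous"),
    ("F", "Big"),
    ("T", "Big"),
    ("T", "Tall"),
]


def jingles(pairs):
    """Names attached to more than one concept (possible homonyms)."""
    concepts_by_name = defaultdict(set)
    for concept, name in pairs:
        concepts_by_name[name].add(concept)
    return {n: sorted(c) for n, c in concepts_by_name.items() if len(c) > 1}


def jangles(pairs):
    """Concepts that carry more than one name (synonyms)."""
    names_by_concept = defaultdict(set)
    for concept, name in pairs:
        names_by_concept[concept].add(name)
    return {c: sorted(n) for c, n in names_by_concept.items() if len(n) > 1}


print(jingles(naming))  # {'Big': ['F', 'T']}
print(jangles(naming))  # {'F': ['Big', 'Famous'], 'T': ['Big', 'Tall']}
```

Note that the same rule that flags “Big” and “Famous” as a jangle for F also flags “Big” and “Tall” for T, because T carries two names in this display as well.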

Data

Symbol

Note: In the original VAST paper, the Data concept has the shape on the left. As that shape is not available in the Graphviz system used to create some graphs in this presentation, we also use this alternative symbol for Data:

Data

[Diagram: data concept T → "Intelligent" (n), T → "High Test Score" (n)]

  • Data (left box, folder-shaped) is a special kind of concept, which explicitly refers to something observable.
    • Same as a manifest variable in Structural Equation Modeling
  • This display is supposed to capture the idea that “Intelligence is what the intelligence test measures”

Data

[Diagram: T1 → "High Score in Test 1" (n), T2 → "High Score in Test 2" (n), T1 → I (r), T2 → I (r), I → "Intelligent" (n)]

  • The r denotes reasoning relationships
  • This display is supposed to capture the idea that more than one test may be used as evidence for people’s intelligence

Data

[Diagram: T1 → "High Score in Test 1" (n), T2 → "High Score in Test 2" (n), T1 → T2 (p), I → T1 (c), I → T2 (c), I → "Intelligent" (n)]

  • The p stands for a prediction
  • The c stands for causal effects

Data

[Diagram: T1 → "High Score in Test 1" (n), T2 → "High Score in Test 2" (n), T1 → T2 (p), I → T1 (c), I → T2 (c), I → "Intelligent" (n)]

This display is supposed to capture the idea that …

  • The same thing has a causal effect on the outcomes of both tests
  • Therefore, the outcomes of the two tests correlate with one another
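
The step from a common cause to a correlation can be illustrated with a small simulation. This is a sketch under assumptions that the display itself does not make: the causal effects are taken to be linear, and I and both noise terms are drawn from a standard normal distribution. Under exactly these assumptions the population correlation between the two test scores is 0.5. The function names are hypothetical.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def simulate(n=10_000, seed=0):
    """Draw two test scores that share one common cause I (assumed linear model)."""
    rng = random.Random(seed)
    t1, t2 = [], []
    for _ in range(n):
        i = rng.gauss(0, 1)             # latent common cause I
        t1.append(i + rng.gauss(0, 1))  # I -> T1 (c), plus independent noise
        t2.append(i + rng.gauss(0, 1))  # I -> T2 (c), plus independent noise
    return t1, t2


t1, t2 = simulate()
print(round(pearson(t1, t2), 2))
```

Neither test score influences the other in this simulation, yet they correlate, which is why the display connects T1 and T2 with a prediction arrow (p) rather than a causal one (c).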

Splitting up the display elements 1

[Diagram: T1 → "High Score in Test 1" (n), T2 → "High Score in Test 2" (n), I → "Intelligent" (n)]

Readability may often be promoted by setting naming relationships aside.
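
Setting naming relationships aside amounts to partitioning the arrows by type. The sketch below assumes the display is stored as (source, target, type) triples and writes each part out as Graphviz DOT text. The helper names `split_naming` and `to_dot` are hypothetical; only the DOT syntax itself is standard.

```python
# Arrows of the two-test display: (source, target, type).
edges = [
    ("T1", '"High Score in Test 1"', "n"),
    ("T2", '"High Score in Test 2"', "n"),
    ("I", '"Intelligent"', "n"),
    ("T1", "T2", "p"),
    ("I", "T1", "c"),
    ("I", "T2", "c"),
]


def split_naming(edges):
    """Separate naming arrows (n) from all other relationship types."""
    naming = [e for e in edges if e[2] == "n"]
    other = [e for e in edges if e[2] != "n"]
    return naming, other


def to_dot(edges, graph_name="G"):
    """Serialize arrows as Graphviz DOT text, escaping embedded quotes."""
    def quote(node_id):
        return '"' + node_id.replace('"', '\\"') + '"'

    lines = [f"digraph {graph_name} {{"]
    for source, target, kind in edges:
        lines.append(f"  {quote(source)} -> {quote(target)} [label={quote(kind)}];")
    lines.append("}")
    return "\n".join(lines)


naming, other = split_naming(edges)
print(to_dot(naming))  # the naming relationships on their own
print(to_dot(other))   # the p and c arrows without any names
```

Names keep their quotation marks inside the node label, matching the VAST convention that names are set in quotation marks.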

Splitting up the display elements 2

[Diagram: T1 → "High Score in Test 1" (n), T2 → "High Score in Test 2" (n), I → "Intelligent" (n)]

[Diagram: T1 → T2 (p), shown separately from I → T1 (c) and I → T2 (c)]

Higher-order concepts 1

Any combination of elements may be made into a new “higher-order concept” by putting it into a frame:

Higher-order concepts 2

Higher-order concepts may be related to one another in exactly the same ways in which “simple” concepts may be:

This diagram expresses the idea that a correlation between the two sets of test scores would be a reason (r) to assume that both sets of test scores reflect causal influences of some common factor (called “intelligence”).

IS and OUGHT

The IS element tells us that a concept is not a mere possibility, but is considered to actually apply:

  • I is causally affected by GCE
  • I is causally affected by GGD

The analyst

The “analyst” heading names the person who made the VAST display. A VAST display always reflects this person’s view:

No other causal paths?

  • An arrow labeled c 0 (type c, strength 0) expresses the idea that there is no other causal influence on “Intelligence” apart from the ones that were specified (CGE or GGD).
  • Mira’s display now contains two ideas that are mutually incompatible. This is not a problem: VAST is used for displaying ideas correctly – those ideas themselves may still be nonsensical or incoherent!

Perspectives

Here, Mira expresses the idea that Robert holds the view on the left-hand side, whereas Ayşe holds the view on the right-hand side. Now the display is no longer logically incoherent.

Strength of relationship

  • Default interpretation of a directional arrow X → Y: this relationship is considered relevant and positive (i.e., the more X, the more Y)
  • Relationship strength of arrows can be expressed with a number from -1 (perfect negative relationship) through 0 (no relationship) to 1 (perfect positive relationship)
    • If no number is explicitly specified: default relationship strength is “> 0”
  • Relationship strength of IS and OUGHT can be expressed with a number between 0 and 1
    • If no number is explicitly specified: default relationship strength is “> 0.5”
  • The strength of a relationship can also be expressed in a vague verbal way, e.g., “strong”, “weak”, “negative”, …
  • The relationship strength can be applied to any relationship type:
    • type p: “wearing glasses” makes it 70 percent likely for a person to also be “smart”
    • type r: It is 90 percent reasonable to assume someone “is in love with you” when that person “giggles a lot while talking to you”
    • type c: being “obese” makes it 50 percent likely for someone to develop “Diabetes Type II” as a consequence
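
The ranges and defaults listed above can be summarized in a few lines of code. This is a minimal sketch, not an official VAST implementation: the element kinds `"arrow"` and `"is_ought"` and the function `resolve_strength` are hypothetical names, and verbal strengths are simply passed through unchanged.

```python
from typing import Optional, Union

Strength = Union[float, str]  # a number or a vague verbal quantifier

# Allowed numeric ranges and defaults as stated in the list above.
RANGES = {"arrow": (-1.0, 1.0), "is_ought": (0.0, 1.0)}
DEFAULTS = {"arrow": "> 0", "is_ought": "> 0.5"}


def resolve_strength(element: str, strength: Optional[Strength] = None) -> Strength:
    """Return the strength to display, applying the default if none is given."""
    if element not in RANGES:
        raise ValueError(f"unknown element kind: {element!r}")
    if strength is None:
        return DEFAULTS[element]
    if isinstance(strength, str):
        return strength  # verbal strengths such as "strong" pass through as-is
    low, high = RANGES[element]
    if not low <= strength <= high:
        raise ValueError(
            f"{element} strength must lie in [{low}, {high}], got {strength}"
        )
    return strength


print(resolve_strength("arrow"))          # > 0
print(resolve_strength("is_ought"))       # > 0.5
print(resolve_strength("arrow", 0.7))     # 0.7
print(resolve_strength("arrow", "weak"))  # weak
```

Keeping the defaults as the strings “> 0” and “> 0.5” rather than as numbers preserves the fact that an unlabeled element specifies only a lower bound, not a point value.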

Exercise: How would you interpret this?

(A): Mira assumes that she has presented Robert’s and Ayşe’s views correctly (with strength > 0.5, which is the default for IS and OUGHT elements). This implies that these two persons make different and incompatible statements about the single causal effect on I.

Note: The implication that Robert’s and Ayşe’s views are incompatible is not explicitly represented in the display, and needs to be figured out by the reader. One could attach a naming relationship to the higher-order concept box that makes this explicit, e.g., by naming it “Incompatible views of Robert and Ayşe”.

(B): Mira says (normatively) that Robert and Ayşe should get along well.

(C): Mira is certain (relationship strength = -1) that the big box below (i.e., the incompatible views) causally leads to Robert and Ayşe not getting along well.

Fuzziness as a feature

  • The goal is precise theories, but in reality theories are often (at least partially) vague and fuzzy
  • VAST makes this fuzziness explicit by allowing vague quantifiers of relationship strength, e.g., “weakly positive”, “strong”, “>0.5”
    • Better than assigning an arbitrary, pseudo-precise number
    • It’s about formalizing the current state/precision of a verbal theory - if it is imprecise, let’s show it!
  • Show gaps, underspecifications, and inconsistencies in the theory

Notes on the different types of relationships

  • Relationships between constructs can be …
    • on a theoretical level. Then it’s typically a causal path (c)
    • on an empirical level: A relationship has been found! This is a prediction path (p)
    • Research cycle: One author’s exploratory p-path is the next author’s theoretical c-path
  • Consider what you want to formalize:
    • A certain author’s theoretical point of view at a certain point in time?
    • Or your personal, current point of view?
    • The empirical support for a statement?

End

CC-BY-SA 4.0