Anaphora
poetry_analysis.anaphora
Anaphora is the repetition of the same line-initial word or phrase in a verse, or across consecutive verses in a stanza.
TODO: Anaphora can also refer to the repetition of a whole stanza-initial verse line in consecutive stanzas. This has not been implemented yet.
NOTE: The current anaphora detection is based on the repetition of the first word in each line. A grading system for how effective the figure is in each poem is planned.
count_initial_phrases(text)
Count the number of times string-initial phrases of different lengths occur in a string.
Source code in src/poetry_analysis/anaphora.py
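As a rough illustration of the idea, the sketch below takes the first n words of a string, for every possible n, and counts how often each such phrase occurs anywhere in the string. It is not the code in src/poetry_analysis/anaphora.py: the name `count_initial_phrases_sketch`, the whitespace tokenization, the lowercasing, and the `Counter` return type are all assumptions made for this example.

```python
from collections import Counter


def count_initial_phrases_sketch(text: str) -> Counter:
    """Count occurrences of every string-initial phrase (hypothetical helper)."""
    # Assumption: tokens are whitespace-separated and compared case-insensitively.
    words = text.lower().split()
    counts: Counter = Counter()
    for n in range(1, len(words) + 1):
        initial = words[:n]
        # Slide a window of n words over the string and count the matches.
        # Overlapping matches are counted as well.
        hits = sum(
            1 for i in range(len(words) - n + 1) if words[i : i + n] == initial
        )
        counts[" ".join(initial)] = hits
    return counts


counts = count_initial_phrases_sketch("ja vi elsker ja vi elsker dette landet")
print(counts["ja vi elsker"])
```

Every string-initial phrase appears in the result at least once, so a caller interested in repetition would keep only the entries with a count of two or more.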
detect_repeating_lines(text)
Detect repeating lines in a poem.
Source code in src/poetry_analysis/anaphora.py
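A minimal sketch of how repeated lines could be found, assuming that lines are compared after stripping surrounding whitespace and that blank lines are ignored. The name `repeating_lines_sketch` and the returned mapping from line to count are hypothetical; the function in this module may return a different structure.

```python
from collections import Counter


def repeating_lines_sketch(text: str) -> dict:
    """Return every line that occurs more than once, with its count."""
    # Assumption: strip whitespace and skip blank lines before comparing.
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    counts = Counter(lines)
    return {line: n for line, n in counts.items() if n > 1}


poem = """Dette er altsaa verden.
En regndraabe!
Dette er altsaa verden."""
print(repeating_lines_sketch(poem))
```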
extract_anaphora(text)
Extract line-initial word sequences that are repeated at least twice.
Examples:
>>> import json
>>> text = '''
... Jeg ser paa den hvide himmel,
... jeg ser paa de graablaa skyer,
... jeg ser paa den blodige sol.
...
... Dette er altsaa verden.
... Dette er altsaa klodernes hjem.
...
... En regndraabe!
... '''
>>> result = extract_anaphora(text)
>>> print(json.dumps(result, indent=4))
{
    "1-grams": {
        "jeg": 3,
        "dette": 2
    },
    "2-grams": {
        "jeg ser": 3,
        "dette er": 2
    },
    "3-grams": {
        "jeg ser paa": 3,
        "dette er altsaa": 2
    },
    "4-grams": {
        "jeg ser paa den": 2
    }
}
Source code in src/poetry_analysis/anaphora.py
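One way counts of this shape could be produced is sketched below: for every line, each line-initial sequence of 1, 2, 3, ... words is tallied, and only sequences seen at least twice are kept. This is an illustration under stated assumptions, not the module's implementation: the name `line_initial_ngrams_sketch`, the `min_count` parameter, the whitespace tokenization, and counting across the whole text rather than per stanza are all hypothetical choices.

```python
from collections import Counter, defaultdict


def line_initial_ngrams_sketch(text: str, min_count: int = 2) -> dict:
    """Tally line-initial word sequences and keep the repeated ones."""
    counts = defaultdict(Counter)
    for line in text.splitlines():
        # Assumption: lowercase and split on whitespace; punctuation is kept.
        words = line.lower().split()
        for n in range(1, len(words) + 1):
            counts[n][" ".join(words[:n])] += 1

    result = {}
    for n in sorted(counts):
        repeated = {p: c for p, c in counts[n].items() if c >= min_count}
        if repeated:  # skip n-gram lengths with no repeated sequence
            result[f"{n}-grams"] = repeated
    return result
```

Note that this tally does not require the repeating lines to be adjacent: in the example above, "jeg ser paa den" is counted twice although the two lines are separated by another line.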
extract_line_anaphora(text)
Extract line-initial word sequences that are repeated at least twice on the same line.
Source code in src/poetry_analysis/anaphora.py
extract_poem_anaphora(text)
Extract line-initial word sequences that are repeated at least twice in each stanza.
Source code in src/poetry_analysis/anaphora.py
extract_stanza_anaphora(stanza, n_words=1)
Gather the indices of all successive lines that begin with the same line-initial word or phrase.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| n_words | int | Number of words to expect in the anaphora, must be 1 or higher. If higher, a single word that is repeated more often than a phrase of n_words will be ignored in favour of the less frequent phrase. | 1 |
Source code in src/poetry_analysis/anaphora.py
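The effect of `n_words` can be illustrated with a small sketch that groups line indices by their first `n_words` words. It assumes the stanza is passed as a list of line strings and that the result maps each phrase to a list of indices; the name `line_indices_by_initial_phrase` and both of these choices are hypothetical and may differ from the module's actual input and output.

```python
from collections import defaultdict


def line_indices_by_initial_phrase(stanza: list, n_words: int = 1) -> dict:
    """Map each line-initial phrase of n_words words to the indices of its lines."""
    if n_words < 1:
        raise ValueError("n_words must be 1 or higher")
    indices = defaultdict(list)
    for i, line in enumerate(stanza):
        words = line.lower().split()
        if len(words) < n_words:
            continue  # line is too short to contain a phrase of this length
        indices[" ".join(words[:n_words])].append(i)
    # Keep only phrases that open more than one line.
    return {phrase: idx for phrase, idx in indices.items() if len(idx) > 1}


stanza = [
    "Jeg ser paa den hvide himmel,",
    "jeg ser paa de graablaa skyer,",
    "jeg ser paa den blodige sol.",
]
print(line_indices_by_initial_phrase(stanza, n_words=1))
print(line_indices_by_initial_phrase(stanza, n_words=4))
```

With `n_words=1` all three lines are grouped under "jeg"; with `n_words=4` only the first and third lines share a phrase, so the more frequent single word no longer appears in the result. The sketch does not check that the indices are consecutive; that check would be a separate step.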
filter_anaphora(stanza_anaphora)
Construct and yield an annotation dictionary only for stanzas where anaphora are immediately successive.
Source code in src/poetry_analysis/anaphora.py
find_longest_most_frequent_anaphora(phrases)
Find the longest and most repeated word sequence in a counter.
Source code in src/poetry_analysis/anaphora.py
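The description leaves open how length and frequency are weighed against each other. The sketch below shows one possible ranking, assuming that the highest count wins and that ties are broken in favour of the phrase with more words. The name `longest_most_frequent_sketch`, the `(phrase, count)` return value, and the ranking rule itself are assumptions for illustration, not the documented behaviour.

```python
from collections import Counter


def longest_most_frequent_sketch(phrases: Counter):
    """Pick one phrase from a counter using a hypothetical ranking rule."""
    if not phrases:
        return None, 0
    # Assumption: rank by count first, then by number of words in the phrase.
    phrase, count = max(
        phrases.items(), key=lambda item: (item[1], len(item[0].split()))
    )
    return phrase, count


phrases = Counter(
    {"jeg": 3, "jeg ser": 3, "jeg ser paa": 3, "jeg ser paa den": 2}
)
print(longest_most_frequent_sketch(phrases))
```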
is_successive(items)
Check whether all numbers in a list are monotonic and incremental.
Source code in src/poetry_analysis/anaphora.py
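A minimal sketch of such a check, assuming that "monotonic and incremental" means each number is exactly one greater than the number before it. The name `is_successive_sketch` is hypothetical, and the treatment of empty and single-item lists (both pass) is a choice made for this example.

```python
def is_successive_sketch(items: list) -> bool:
    """Return True if every number is exactly one greater than the previous one."""
    # Compare each number with its right-hand neighbour.
    # all() over an empty sequence is True, so [] and [n] both pass.
    return all(b - a == 1 for a, b in zip(items, items[1:]))


print(is_successive_sketch([0, 1, 2]))
print(is_successive_sketch([0, 2]))
```

Applied to line indices, the first call corresponds to three adjacent lines and the second to two lines with a gap between them.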