This runs for me, but I'm still on R 4.0.5. Maybe someone with R 4.1 can try it:
election0 <- structure(list(line_text = c("Date de l'export;Code du département;Type de scrutin;Libellé du département;Code de la commune;Libellé de la commune;Inscrits;Abstentions;% Abs/Ins;Votants;% Vot/Ins;Blancs et nuls;% BlNuls/Ins;% BlNuls/Vot;Exprimés;% Exp/Ins;% Exp/Vot;Code Nuance;Sexe;Nom;Prénom;Liste;Sièges / Elu;Sièges Secteur;Sièges CC;Voix;% Voix/Ins;% Voix/Exp;",
"25/03/2014 12:50:21;01;LI2;AIN;004;Ambérieu-en-Bugey;00008198;00003422;41,74;00004776;58,26;00000191;2,33;4,00;00004585;55,93;96,00;LDVG;F;EXPOSITO;Josiane;AMBERIEU AMBITION;0;0;0;00000954;11,64;20,81;LDVG;F;PIDOUX;Catherine;VIVONS NOTRE VILLE;0;0;0;00000822;10,03;17,93;LUMP;M;FORTIN;Christophe;AMBERIEU RENOUVEAU;0;0;0;00001383;16,87;30,16;LDVD;M;FABRE;Daniel;PAROLE AUX AMBARROIS;0;0;0;00001426;17,39;31,10;",
"\n25/03/2014 12:50:21;01;LI2;AIN;007;Ambronay;00001770;00000511;28,87;00001259;71,13;00000068;3,84;5,40;00001191;67,29;94,60;LDVG;F;LEVRAT;Gisèle;AMBRONAY POUR TOUS;0;0;0;00000552;31,19;46,35;LDVD;M;FOURNIER;Gabriel;AMBRONAY Demain;0;0;0;00000178;10,06;14,95;LDVD;M;MANCUSO;Vincent;AGIR ENSEMBLE POUR L'AVENIR D'AMBRONAY;0;0;0;00000461;26,05;38,71;",
"\n25/03/2014 12:50:21;01;LI2;AIN;014;Arbent;00002167;00001061;48,96;00001106;51,04;00000176;8,12;15,91;00000930;42,92;84,09;LUMP;F;MAISSIAT;Liliane;POUR L'AVENIR DE TOUS, CONTINUONS ENSEMBLE;23;0;3;00000930;42,92;100,00;",
"\n25/03/2014 12:50:21;01;LI2;AIN;022;Artemare;00000857;00000237;27,65;00000620;72,35;00000027;3,15;4,35;00000593;69,19;95,65;LDVD;F;CHARMONT-MUNET;Mireille;AVEC VOUS, GARDONS LE CAP;13;0;2;00000386;45,04;65,09;LDVD;M;LESEUR;Philippe;ARTEMARE, UNE DYNAMIQUE POUR L'AVENIR;2;0;0;00000207;24,15;34,91;",
"\n25/03/2014 12:50:21;01;LI2;AIN;025;Bâgé-la-Ville;00002166;00001051;48,52;00001115;51,48;00000290;13,39;26,01;00000825;38,09;73,99;LDIV;M;REPIQUET;Dominique;Préparons l'avenir;23;0;5;00000825;38,09;100,00;"
)), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"
))
library(tidyverse)
mutate(election0,
  # split each line on the ";" delimiter
  split_text = strsplit(line_text, ";"),
  # assume the first 17 elements are common to the whole row
  split_df = map(split_text, ~ .[1:17]),
  # and everything past them repeats in blocks of 11
  split_names = map(split_text, ~ .[-c(1:17)]),
  # total number of elements in each row
  columns = map_dbl(split_text, length),
  # the number of repeating 11-element name blocks in each row
  n_names = (columns - 17) / 11
)
My sessionInfo() is:
R version 4.0.5 (2021-03-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19043)
Matrix products: default
locale:
[1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252 LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C LC_TIME=English_United Kingdom.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] forcats_0.5.1 stringr_1.4.0 dplyr_1.0.7 purrr_0.3.4 readr_2.0.0 tidyr_1.1.3 tibble_3.1.2 ggplot2_3.3.5
[9] tidyverse_1.3.1
loaded via a namespace (and not attached):
[1] Rcpp_1.0.4.6 cellranger_1.1.0 pillar_1.6.1 compiler_4.0.5 dbplyr_2.1.1 tools_4.0.5 jsonlite_1.7.2
[8] lubridate_1.7.10 lifecycle_1.0.0 gtable_0.3.0 pkgconfig_2.0.3 rlang_0.4.10 reprex_2.0.0 cli_3.0.0
[15] rstudioapi_0.13 DBI_1.0.0 haven_2.3.1 xml2_1.3.2 withr_2.2.0 httr_1.4.2 fs_1.5.0
[22] generics_0.1.0 vctrs_0.3.8 hms_1.1.0 grid_4.0.5 tidyselect_1.1.0 glue_1.4.1 R6_2.4.1
[29] fansi_0.4.1 readxl_1.3.1 tzdb_0.1.1 modelr_0.1.8 magrittr_2.0.1 backports_1.1.7 scales_1.1.1
[36] ellipsis_0.3.2 rvest_1.0.0 assertthat_0.2.1 colorspace_1.4-1 utf8_1.1.4 stringi_1.4.6 munsell_0.5.0
[43] broom_0.7.8 crayon_1.4.1
Perhaps you should check whether this truncated data and code work for you or still crash.
That is, the code may run fine on these records while the true dataset has special characters somewhere further down that cause the problem.
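If the full file does turn out to be the problem, one way to narrow it down is to flag the rows that look suspicious before splitting them. The sketch below is only a suggestion, not something I have run on your data: `find_suspect_lines` is a made-up helper name, and it assumes the same layout as the code above (17 common elements followed by blocks of 11). It uses only base R (`validUTF8()`, `strsplit()`, `lengths()`), and flags rows that are not valid UTF-8 or whose element count does not fit that layout.

```r
# Hypothetical helper (not from any package): return the indices of rows that
# are either not valid UTF-8 or do not fit the assumed 17 + 11 * k layout.
find_suspect_lines <- function(lines, n_common = 17, n_repeat = 11) {
  # useBytes = TRUE splits byte-wise, so badly encoded text cannot stop the split
  n_fields <- lengths(strsplit(lines, ";", fixed = TRUE, useBytes = TRUE))
  # TRUE where the string is not valid UTF-8
  bad_encoding <- !validUTF8(lines)
  # TRUE where the row is too short or the repeating part is not a multiple of n_repeat
  bad_shape <- n_fields < n_common | (n_fields - n_common) %% n_repeat != 0
  which(bad_encoding | bad_shape)
}
```

Calling `find_suspect_lines(election0$line_text)` on the real dataset would give the row numbers worth looking at by eye. Note that the encoding check only makes sense if the file was meant to be read as UTF-8; a file in another encoding such as Latin-1 would have every accented row flagged, which is itself a useful hint.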