space. In International Conference on Learning Rep-
resentations.
Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., and
Dean, J. (2013b). Distributed representations of words
and phrases and their compositionality. Advances in
Neural Information Processing Systems, 26.
Mikolov, T., Yih, W.-t., and Zweig, G. (2013c). Linguis-
tic regularities in continuous space word representa-
tions. In Proceedings of the 2013 Conference of the
North American Chapter of the Association for
Computational Linguistics: Human Language
Technologies, pages 746–751.
Penedo, G., Malartic, Q., Hesslow, D., Cojocaru, R., Cap-
pelli, A., Alobeidli, H., Pannier, B., Almazrouei,
E., and Launay, J. (2023). The RefinedWeb dataset
for Falcon LLM: outperforming curated corpora with
web data, and web data only. arXiv preprint
arXiv:2306.01116.
Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S.,
Matena, M., Zhou, Y., Li, W., and Liu, P. J. (2020).
Exploring the limits of transfer learning with a uni-
fied text-to-text transformer. The Journal of Machine
Learning Research, 21(1):5485–5551.
Schaeffer, K. (2020). Far more Americans see ‘very strong’
partisan conflicts now than in the last two presidential
election years. Pew Research Center.
Sheikh Ali, Z., Mansour, W., Haouari, F., Hasanain, M.,
Elsayed, T., and Al-Ali, A. (2023). Tahaqqaq: a
real-time system for assisting Twitter users in Arabic claim
verification. In Proceedings of the 46th International
ACM SIGIR Conference on Research and Development
in Information Retrieval, pages 3170–3174.
Sun, Y., He, J., Lei, S., Cui, L., and Lu, C.-T. (2023).
Med-MMHL: A multi-modal dataset for detecting
human- and LLM-generated misinformation in the medical
domain. arXiv preprint arXiv:2306.08871.
Taylor, R., Kardas, M., Cucurull, G., Scialom, T.,
Hartshorn, A., Saravia, E., Poulton, A., Kerkez, V.,
and Stojnic, R. (2022). Galactica: A large language
model for science. arXiv preprint arXiv:2211.09085.
Thorne, J., Vlachos, A., Christodoulopoulos, C., and Mittal,
A. (2018). FEVER: a large-scale dataset for fact ex-
traction and verification. In Proceedings of the 2018
Conference of the North American Chapter of the As-
sociation for Computational Linguistics: Human Lan-
guage Technologies, Volume 1, pages 809–819.
Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi,
A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava,
P., Bhosale, S., Bikel, D., Blecher, L., Ferrer, C. C.,
Chen, M., Cucurull, G., Esiobu, D., Fernandes, J., Fu,
J., Fu, W., Fuller, B., Gao, C., Goswami, V., Goyal,
N., Hartshorn, A., Hosseini, S., Hou, R., Inan, H.,
Kardas, M., Kerkez, V., Khabsa, M., Kloumann, I.,
Korenev, A., Koura, P. S., Lachaux, M.-A., Lavril, T.,
Lee, J., Liskovich, D., Lu, Y., Mao, Y., Martinet, X.,
Mihaylov, T., Mishra, P., Molybog, I., Nie, Y., Poul-
ton, A., Reizenstein, J., Rungta, R., Saladi, K., Schel-
ten, A., Silva, R., Smith, E. M., Subramanian, R.,
Tan, X. E., Tang, B., Taylor, R., Williams, A., Kuan,
J. X., Xu, P., Yan, Z., Zarov, I., Zhang, Y., Fan, A.,
Kambadur, M., Narang, S., Rodriguez, A., Stojnic, R.,
Edunov, S., and Scialom, T. (2023). Llama 2: Open
foundation and fine-tuned chat models. arXiv preprint
arXiv:2307.09288.
Van der Maaten, L. and Hinton, G. (2008). Visualizing data
using t-SNE. Journal of Machine Learning Research,
9(11).
Wach, K., Duong, C. D., Ejdys, J., Kazlauskaitė, R.,
Korzynski, P., Mazurek, G., Paliszkiewicz, J., and
Ziemba, E. (2023). The dark side of generative arti-
ficial intelligence: A critical analysis of controversies
and risks of ChatGPT. Entrepreneurial Business and
Economics Review, 11(2):7–24.
Yang, R., Ma, J., Lin, H., and Gao, W. (2022). A weakly
supervised propagation model for rumor verification
and stance detection with multiple instance learning.
In Proceedings of the 45th International ACM SIGIR
Conference on Research and Development in Informa-
tion Retrieval, pages 1761–1772.
Yao, B. M., Shah, A., Sun, L., Cho, J.-H., and Huang, L.
(2023). End-to-end multimodal fact-checking and ex-
planation generation: A challenging dataset and mod-
els. In Proceedings of the 46th International ACM
SIGIR Conference on Research and Development in
Information Retrieval, pages 2733–2743.
Zhou, J., Zhang, Y., Luo, Q., Parker, A. G., and De Choud-
hury, M. (2023). Synthetic lies: Understanding AI-
generated misinformation and evaluating algorithmic
and human solutions. In Proceedings of the 2023 CHI
Conference on Human Factors in Computing Systems,
pages 1–20.
SECRYPT 2024 - 21st International Conference on Security and Cryptography