Transformer Variational Wave Functions for Frustrated Quantum Spin Systems

Viteritti, Luciano Loris; Rende, Riccardo; Becca, Federico
2023-01-01

Abstract

The transformer architecture has become the state-of-the-art model for natural language processing tasks and, more recently, also for computer vision tasks, where it defines the vision transformer (ViT) architecture. Its key feature is the ability to capture long-range correlations among the elements of the input sequences through the so-called self-attention mechanism. Here, we propose an adaptation of the ViT architecture with complex parameters to define a new class of variational neural-network states for quantum many-body systems, the ViT wave function. We apply this idea to the one-dimensional J_{1}-J_{2} Heisenberg model, demonstrating that a relatively simple parametrization yields excellent results for both gapped and gapless phases. In this case, high accuracies are obtained with a relatively shallow architecture containing a single layer of self-attention, which largely simplifies the original architecture. Still, the optimization of a deeper structure is possible and can be used for more challenging models, most notably highly frustrated systems in two dimensions. The success of the ViT wave function relies on mixing both local and global operations, enabling the study of large systems with high accuracy.
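The abstract describes the ansatz only at a high level. As an illustration, here is a minimal sketch, in plain NumPy, of how a single-layer, complex-parameter "ViT wave function" for a 1D spin chain could be structured: the configuration is split into patches (a local operation), linearly embedded, mixed by one all-to-all self-attention layer (a global operation), and pooled into a complex log-amplitude. All names (W_emb, w_out, log_psi), the sizes L, P, D, and the choice of taking the softmax over the real part of the complex attention scores are assumptions made for this sketch; this is not the authors' actual implementation.

```python
# Minimal sketch of a single-layer complex "ViT wave function" for a 1D spin chain.
# Hypothetical parameter names and sizes; not the architecture of the paper.
import numpy as np

rng = np.random.default_rng(0)

L, P, D = 16, 4, 8          # chain length, patch size, embedding dimension
n_patch = L // P

def cplx(*shape, scale=0.1):
    """Random complex-valued variational parameters."""
    return scale * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))

# Variational parameters: patch embedding, attention matrices, output vector.
W_emb = cplx(P, D)
W_q, W_k, W_v = cplx(D, D), cplx(D, D), cplx(D, D)
w_out = cplx(D)

def log_psi(sigma):
    """Complex log-amplitude log psi(sigma) for sigma in {-1, +1}^L."""
    x = sigma.reshape(n_patch, P) @ W_emb        # local: embed each patch of P spins
    q, k, v = x @ W_q, x @ W_k, x @ W_v          # queries, keys, values
    scores = q @ k.T / np.sqrt(D)                # global: all-to-all patch couplings
    # Softmax over the real part keeps the attention weights well defined
    # for complex scores (one possible convention, assumed here).
    a = np.exp(scores.real - scores.real.max(axis=-1, keepdims=True))
    a /= a.sum(axis=-1, keepdims=True)
    y = a @ v                                    # attention output, one row per patch
    return np.sum(np.tanh(y @ w_out))            # pool patches into log psi

sigma = rng.choice([-1.0, 1.0], size=L)
print(log_psi(sigma))                            # a complex number
```

The patch embedding acts locally on groups of P spins, while the attention matrix couples every patch to every other one; this is the mixing of local and global operations that the abstract credits for the accuracy on large systems. In an actual variational Monte Carlo calculation, log_psi would be evaluated on sampled configurations and the complex parameters optimized, for instance by stochastic reconfiguration.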
Files in this record:
PhysRevLett.130.236401.pdf — published version, publisher copyright, closed access, 727.96 kB, Adobe PDF
supp_material.pdf — supplementary material, publisher copyright, closed access, 113.48 kB, Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11368/3075239
Citations
  • PubMed Central 0
  • Scopus 10
  • Web of Science 9