Chemical Profile and Chemometric Analysis of Genetically Modified Soybeans Produced in the Triângulo Mineiro Region (MG), Brazil

Journal of Agricultural Studies ISSN 2166-0379 2021, Vol. 9, No. 2 http://jas.macrothink.org

Soy production in Brazil is an important factor for the agro-industrial, economic, and social development of the country. The expansion of soy in the Brazilian territory is mainly due to the incorporation of new genetic characteristics into cultivars that granted resistance to the Cerrado conditions and to herbicides. Currently, Brazilian soy production is the result of genetically modified cultivars. Studies regarding the chemical composition of soybeans show that qualitative and quantitative variations can occur, depending on the region of production. This work aimed to investigate the chemical composition of soybeans produced in different cities of the Triângulo Mineiro region/MG, Brazil (Harvest 2017/2018) and stored in three warehouses located in the city of Uberaba/MG. The grain analysis was made by liquid chromatography coupled to electrospray ionization mass spectrometry (LC-MS-ESI). The classes of metabolites identified from methanolic extraction were organic acids, phenolic compounds, flavonoids, sugars, amino acids, dipeptides, nitrogenous bases, nucleosides, sphingolipids, and fatty acids. The isoflavones genistein, daidzein, glycitein, genistin, acetyldaidzin, and acetylgenistin were identified in soybeans from the three warehouses. The flavonoid eriodictyol-O-hexoside was also found. The Principal Component Analysis (PCA) from the mass spectrum data obtained by direct injection in the negative and positive modes evidenced the well-defined separation of three groups, indicating that there was variance among the soy samples from each warehouse. The samples from warehouses 1 and 3 showed greater similarity in the Hierarchical Cluster Analysis (HCA) in negative mode, while in positive mode, the samples from warehouses 2 and 3 presented greater similarity.


Introduction
The global production of soybean (Glycine max) for the 2017/2018 harvest was around 342 million tons, with a planted area of 124.5 million hectares. The United States ranked first with a production of 120 million tons (USDA, 2020), followed by Brazil with about 119 million tons, representing 34.7% of world production. Brazilian production in the 2018/2019 harvest was approximately 115 million tons, keeping the country as the second largest producer in the world. The projection for the 2019/2020 harvest is that these two countries will remain the world leaders in soy production (USDA, 2020).
Soy is a key component of agro-industrial development in Brazil, and its remarkable current production was made possible by the development of genetically modified varieties adapted to the Cerrado conditions and resistant to herbicides. Genetically modified cultivars are used in almost all of the soybean area planted in Brazil (Dall'agnol, 2016). Among the many producing regions, the State of Minas Gerais, located in Southeast Brazil, contributes 9% of the total production, being the largest producer in that region (USDA, 2020). It is estimated that Brazil will export about 75 million tons in 2020, and 44.5 million tons will be crushed for the production of soybean meal and oil. Soybean oil is used mainly for human consumption and for the manufacture of biodiesel, while the meal is used for the production of animal feed and industrial human food. Soy has several nutritional properties, being considered a very versatile food rich in essential nutrients such as lipids, proteins, fibers, vitamins, and minerals (Jooyandeh, 2011). Various products and by-products for human consumption can originate from soy, such as flour, butter, cheese, soy flakes, dietary foods, textured protein, soy-burgers, soy-based infant formulas, supplements, soy milk and soy-based drinks, and veggie food, among others. To meet a growing demand in human food, soy-based products have been developed to provide new, healthier foods with a more attractive taste (He & Chen, 2013; Rizzo & Baroni, 2018).
In this perspective, the consumption of soy-based foods has been increasing in recent years due to their benefits for human health. These benefits are often related to bioactive substances present in soybeans, mainly phenolic compounds (Verardo et al., 2015). This class of metabolites is represented in soy by phenolic acids and flavonoids. Among flavonoids, isoflavones are the most studied components. These compounds are related to the antioxidant properties of soy (Nam et al., 2014) and to other biological properties such as anticancer potential (Ko et al., 2014; Mahmoud et al., 2014) and estrogenic effect (Zaheer & Akhtar, 2015). Soy is one of the richest food sources of isoflavones (Mahmoud et al., 2014), and genistein, daidzein, genistin, daidzin, and glycitein are the most commonly found compounds (Cavaliere et al., 2007; Lee et al., 2008; Dueñas et al., 2012; Mahmoud et al., 2014; Verardo et al., 2015).
Although Brazil is one of the major world producers, and despite the economic importance of the various products obtained from soybean processing, few studies in the literature analyze the organic metabolites of Brazilian soy, particularly soy produced in the Triângulo Mineiro region, Minas Gerais, Brazil. Thus, the objective of this work was to investigate the chemical constituents of soybeans from the 2017/2018 harvest, produced and stored in the Triângulo Mineiro region, an important soybean production center in the State of Minas Gerais, Brazil. The data presented here represent the first information generated for this region, which highlights the importance of this study.

Study Area
The genetically modified soybeans used in this study were produced in the Triângulo Mineiro region/MG, Brazil (2017/2018 harvest), and the area is outlined in Figure 1. The soy samples were collected in three different warehouses located in the city of Uberaba/MG, which receives the main volume of soy produced in this region.

Warehouses
Three warehouses in the Uberaba region were selected for the collection of the soybean samples. These warehouses were chosen because they receive a considerable volume of genetically modified soybeans from different producers in Uberaba and the nearby region. The selected warehouses are identified herein by their latitude and longitude coordinates as warehouse 1 (Lat: -19.707872, Long: -47.979147), warehouse 2 (Lat: -19.700648, Long: -47.977304), and warehouse 3 (Lat: -19.702434, Long: -47.972029).

Sample Collection and Preparation
The sampling was conducted in accordance with Normative Instruction No. 60, article 17 of MAPA, from 12/22/2011, with modifications (MAPA, 2011). Samples of soybeans were collected from warehouses 1, 2, and 3 at the beginning of the harvest (03/20/18 to 03/29/18). After the material was received, the impurities (0-1%) were removed, and the grains were transferred to a grain dryer, where they remained at 110 °C until reaching the moisture content required for storage (13-14%). The collection was then carried out at the outlet of the grain dryer, through the hole used to measure the moisture content. A mass of approximately 840 g was collected every 5 minutes, totaling 10 kg per day. At the end, the three samples collected from each warehouse were grouped and homogenized, and then stored in an ultra-freezer until the moment of analysis. During the three days of collection, the receipt of soybeans from all the cities outlined in Figure 1 was also recorded.
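The sampling figures above imply a number of draws per day that the text does not state explicitly. The short Python sketch below works through that arithmetic; the assumption that the daily total is reached with a whole number of equal draws, and that the three daily samples of a warehouse are pooled in full, is ours and not part of the original protocol.

```python
# Figures taken from the sampling description above.
increment_g = 840        # mass collected at each draw
interval_min = 5         # time between consecutive draws
daily_target_g = 10_000  # stated daily total (about 10 kg)
collection_days = 3      # days of collection per warehouse

# Assumption: the daily total is reached with a whole number of equal draws.
draws_per_day = round(daily_target_g / increment_g)
daily_mass_g = draws_per_day * increment_g
sampling_time_min = draws_per_day * interval_min

# Assumption: the three daily samples are pooled in full into one composite.
composite_kg = collection_days * daily_mass_g / 1000

print(draws_per_day, daily_mass_g, sampling_time_min, composite_kg)
```

Under these assumptions the daily total corresponds to about a dozen draws spread over roughly one hour of dryer output.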

Soybean Methanol Extraction
Ground soybeans (5.0 g) from each of the warehouses were weighed in 50 mL beakers, followed by the addition of 15 mL of spectroscopic-grade methanol. Extractions were performed in an ultrasound bath for a period of 30 minutes. The supernatant was first filtered through C18 cartridges for glass syringes containing Millipore filters (0.45 µm), and then filtered again into vials (2 mL). This procedure was performed in triplicate for each warehouse. Subsequently, the samples in the vials were analyzed by liquid chromatography coupled to mass spectrometry with electrospray ionization (LC-MS-ESI).
The extractions for the mass spectrometry experiments by direct injection followed the same procedure as for the LC-MS-ESI experiments. However, ten extractions were carried out for each warehouse. These extractions were done to obtain the grain mass spectra used in the chemometric analysis.

Liquid Chromatography Coupled to Electrospray Ionization Mass Spectrometry (LC-MS-ESI)
The soybean analyses were performed on an Agilent® Infinity 1260 UHPLC coupled to a high-resolution mass spectrometer (Agilent® 6520 B) Quadrupole Time of Flight (Q-TOF) with an electrospray ionization (ESI) source.
The mass spectrometer operated with a nebulizing pressure of 20 psi, a drying gas flow of 8.0 L min⁻¹ at a temperature of 220 °C, and a capillary voltage of 4.5 kV. The measurements were taken in both positive and negative modes, obtaining [M+H]⁺ and [M-H]⁻ ions, respectively. The masses were obtained in high resolution (MS), and the molecular formulas were proposed on the basis of the smallest difference between the experimental mass and the theoretical mass, the double bond equivalence, and the nitrogen rule. The error in ppm was calculated according to the equation: $E_{\mathrm{ppm}} = \dfrac{\text{experimental mass} - \text{exact mass}}{\text{exact mass}} \times 10^{6}$. Sequential mass spectrometry (MS²) was performed at different collision energies for the positive and negative modes. The structures were proposed by comparing the fragment mass spectra and the high-resolution masses with other works in the literature and with the Metlin database.
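The three criteria used to propose molecular formulas (mass error in ppm, double bond equivalence, and the nitrogen rule) can be illustrated with a minimal Python sketch. The function names are hypothetical and the measured m/z value in the last line is invented for illustration; only the formula C23H22O10, which the article assigns to compound 34, comes from the text.

```python
def ppm_error(experimental_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (experimental_mz - theoretical_mz) / theoretical_mz * 1e6


def double_bond_equivalents(c: int, h: int, n: int = 0, halogens: int = 0) -> float:
    """Rings plus double bonds of a neutral C/H/N/O/halogen formula."""
    return c - (h + halogens) / 2 + n / 2 + 1


def satisfies_nitrogen_rule(nominal_mass: int, n: int) -> bool:
    """Neutral molecule: the nominal mass is odd exactly when the N count is odd."""
    return (nominal_mass % 2) == (n % 2)


# C23H22O10: formula assigned in the article to compound 34.
dbe = double_bond_equivalents(c=23, h=22)
nominal_mass = 23 * 12 + 22 * 1 + 10 * 16
print(dbe, nominal_mass, satisfies_nitrogen_rule(nominal_mass, n=0))

# Hypothetical measured and theoretical m/z values, for illustration only.
print(round(ppm_error(457.1143, 457.1140), 2))
```

Oxygen does not enter the double bond equivalence, which is why it is not a parameter of the function.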

Mass Spectrometry by Direct Injection
Mass spectrometry by direct injection was used to compare the chemical composition of soybeans between the warehouses. The analyses were performed on an Agilent® 6520 Quadrupole Time of Flight (Q-TOF) mass spectrometer with an electrospray ionization (ESI) source operating in positive and negative modes. The methanolic solutions were injected directly into the mass spectrometer at a flow of 0.2 mL min⁻¹. The general conditions of the equipment were as follows: drying gas temperature of 220 °C, capillary voltage of 4.5 kV, and cone voltage of 65 V. The mass spectra were measured in scan mode over a mass-to-charge ratio (m/z) range of 100 to 1,000. The high-resolution ion masses were obtained in positive and negative modes and are presented as [M+H]⁺ or [M-H]⁻. A total of 10 samples of soybeans from each warehouse were extracted with methanol and analyzed by mass spectrometry through direct injection.

Statistical Analysis
The averages of the ten analyses of the soybeans from each warehouse were obtained using the MassHunter Workstation Qualitative Analysis software (Agilent®). The ions in each sample were obtained for the positive and negative modes. Although the data were acquired on a high-resolution mass analyzer, only the integer values of the mass/charge ratio (m/z) were considered. The data were exported to spreadsheets as a table of mass/charge ratio (m/z) and absolute abundance. The files were grouped into three different folders: soybeans from Warehouse 1, soybeans from Warehouse 2, and soybeans from Warehouse 3.
The data, in both modes, were imported into the MATLAB environment, version R2015a, and standardized, and the ion signals with an intensity lower than or equal to 5% of the maximum abundance were removed, resulting in the matrix for analysis. The mean-centered data were submitted to unsupervised exploratory analysis by PCA (Principal Component Analysis) (Hotelling, 1933) and HCA (Hierarchical Cluster Analysis) (Bridges Junior, 1966). The models were built using PLS_Toolbox, version 8.62.
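The reduction of high-resolution peak lists to integer m/z values, followed by averaging over replicates, can be sketched in Python as below. This is not the MassHunter workflow: truncating m/z (rather than rounding), summing the abundances that fall on the same integer, and treating a missing ion as zero are all assumptions, and the peak lists are invented.

```python
from collections import defaultdict


def to_integer_mz(peaks):
    """Sum the abundances of all peaks that share the same integer m/z.

    Truncation with int() is an assumption; the text does not say
    whether the values were truncated or rounded.
    """
    binned = defaultdict(float)
    for mz, abundance in peaks:
        binned[int(mz)] += abundance
    return dict(binned)


def average_spectra(spectra):
    """Average binned spectra; an m/z absent from a spectrum counts as 0."""
    all_mz = sorted({mz for spectrum in spectra for mz in spectrum})
    n = len(spectra)
    return {mz: sum(s.get(mz, 0.0) for s in spectra) / n for mz in all_mz}


# Two made-up replicate peak lists of (m/z, absolute abundance).
replicate_1 = [(191.02, 1000.0), (341.11, 400.0), (341.90, 100.0)]
replicate_2 = [(191.03, 800.0), (269.05, 200.0)]

mean_spectrum = average_spectra(
    [to_integer_mz(replicate_1), to_integer_mz(replicate_2)]
)
print(mean_spectrum)
```

Each averaged spectrum then becomes one table of integer m/z against abundance, the form in which the data were exported.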
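The preprocessing and PCA steps described above can be sketched in plain Python. This is an illustrative stand-in, not the MATLAB and PLS_Toolbox code used in the study: reading "standardized" as scaling each spectrum to its base peak, dropping an m/z channel only when it is at or below 5% in every spectrum, and extracting only the first principal component by power iteration are assumptions, and the abundance matrix is invented.

```python
import math


def normalize_rows(matrix):
    """Scale each spectrum (row) so that its base peak equals 1."""
    return [[value / max(row) for value in row] for row in matrix]


def drop_weak_columns(matrix, threshold=0.05):
    """Keep the m/z columns that exceed the threshold in at least one spectrum."""
    keep = [j for j in range(len(matrix[0]))
            if any(row[j] > threshold for row in matrix)]
    return [[row[j] for j in keep] for row in matrix], keep


def mean_center(matrix):
    """Subtract the column mean from every entry."""
    n = len(matrix)
    means = [sum(column) / n for column in zip(*matrix)]
    return [[value - mean for value, mean in zip(row, means)] for row in matrix]


def first_principal_component(centered, iterations=200):
    """Power iteration on X^T X; returns (loadings, explained variance fraction)."""
    p = len(centered[0])
    scatter = [[sum(row[i] * row[j] for row in centered) for j in range(p)]
               for i in range(p)]
    # Start from the scatter-matrix column of the most variable m/z channel.
    k = max(range(p), key=lambda i: scatter[i][i])
    v = [scatter[i][k] for i in range(p)]
    for _ in range(iterations):
        w = [sum(scatter[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * scatter[i][j] * v[j]
                     for i in range(p) for j in range(p))
    return v, eigenvalue / sum(scatter[i][i] for i in range(p))


# Made-up abundances: 4 spectra (rows) x 4 integer m/z channels (columns).
mz_channels = [191, 269, 341, 500]
raw = [
    [900.0, 120.0, 400.0, 10.0],
    [950.0, 100.0, 420.0, 12.0],
    [400.0, 300.0, 900.0, 9.0],
    [420.0, 310.0, 880.0, 11.0],
]
filtered, kept = drop_weak_columns(normalize_rows(raw))
loadings, explained = first_principal_component(mean_center(filtered))
print([mz_channels[j] for j in kept], round(explained, 3))
```

The loadings returned here play the role of a loading plot: channels with large absolute loadings are the m/z values that drive the separation along that component.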

Identification of Compounds by LC-MS-ESI
The chemical profile of soybeans from warehouses 1, 2, and 3, in the negative and positive modes, is represented in the base-peak chromatograms (BPC) shown in Figure 2. From the chromatograms obtained by LC-MS-ESI, in the negative and positive modes, it was possible to infer that the soybeans stored in the three warehouses (Harvest 2017/2018) presented similarities regarding their chemical composition. This was confirmed when the chemical composition of the three warehouses was investigated. Table 1 shows the ions in negative and positive modes, the error in ppm, the fragmentation (MS²), the molecular formula, and the tentative identification of the compounds present in the methanol extracts of soybeans produced in the Triângulo Mineiro region and stored at three different warehouses in the city of Uberaba/MG. According to LC-MS-ESI, no qualitative variations were observed between the grains from warehouses 1, 2, and 3. However, inspection of the area of each peak in the chromatogram suggests that there are quantitative variations among the different soy samples. The proposed chemical structures are shown in Figure 3.
A total of 44 compounds were observed in the methanol extracts of soybeans, with 34 being identified. The chemical composition of the soybeans analyzed by LC-MS-ESI from methanolic extracts consisted of organic acids, phenolic compounds, saccharides, amino acids, dipeptides, nitrogenous bases, nucleosides, sphingolipids, fatty acids, and flavonoids.
Some organic acids were observed in the first minute of the chromatogram. The molecular ions [M-H]⁻ at m/z 225 and m/z 195, according to Guerreiro et al. (2014), Li et al. (2016), and the Metlin library, were identified as glucoheptonic acid (2) and gluconic acid (3), respectively. Gluconic acid has already been determined in the root and shoot extracts of Glycine max L. (cv. Suzuyutaka) by capillary electrophoresis/mass spectrometry (Tawaraya et al., 2014).
The isocitric acid (isocitrate) (6) and citric acid (8) 2017) and Masike et al. (2017). Isocitrate (6) is the substrate of the enzyme isocitrate lyase. This enzyme participates in the regulation of the glyoxylate cycle and plays a role in the seed germination process (Martins et al., 2000). The content of isocitrate in soybean seeds can vary with the type of storage and the different cultivars (Carvalho et al., 2014). (Farag et al., 2017). Among the various studies consulted in the literature on the chemical constituents of soybeans, no reports were found on the presence of these compounds. Jasmonic acid and its derivatives are produced by different species of plants and are related to defense mechanisms. In Glycine max (L.) cv. G7R-315, the content of jasmonic acid has been studied in the pericarp of soybean seeds at various stages of development (Creelman et al., 1992).
Compounds 32, 36, 37, and 38 showed the molecular ions [M-H]⁻ at m/z 431, m/z 253, m/z 283, and m/z 269, respectively. The fragmentation spectra corresponded to the isoflavones genistin (genistein-7-O-glucoside) (32), daidzein (36), glycitein (37), and genistein (38), respectively. The presence of these compounds has already been described in different soy cultivars (Lozovaya et al., 2005; Lin & Harnly, 2007; Dueñas et al., 2012; Verardo et al., 2015). In the isoflavone region of the chromatogram (6.0-7.0 min), the molecular ions [M-H]⁻ at m/z 457 and m/z 473 were related to the formulas C23H22O10 (error 0.65 ppm) and C23H22O11 (error 0.0 ppm), and to the compounds daidzein acetylglycoside (34) and genistein acetylglycoside (35), respectively. Although no fragment spectra were obtained for these isoflavones at any collision energy, it was possible to infer their structures, since the exact mass enables the suggestion of the molecular formula and these compounds are well-known components of soybeans. In addition, the elution order observed for the isoflavones is consistent with other studies in the literature (Chen et al., 2005; Cavaliere et al., 2007; Dueñas et al., 2012; Verardo et al., 2015).
Studies have indicated that the presence and concentration of isoflavones in soybeans depend on environmental and genetic factors. Significant differences in the content of genistein, daidzein, and glycitein were found when variations in soybean type, soil type, and temperature were evaluated. Under controlled conditions, temperature was the factor that influenced the content of isoflavones the most. In general, low temperatures promoted higher levels of isoflavone, and the amplitude of the response depended on the type of cultivar analyzed (Lozovaya et al., 2005). In another study with ten soybean cultivars in two regions of Korea, relevant differences in the content of isoflavones, phenols, and antioxidant activity were observed, depending on the location and type of cultivar. Tryptophan, epicatechin, daidzin, and genistin were the major compounds found in soybean methanol extracts (Nam et al., 2014).
The same qualitative variation observed in the chemical composition of soybeans regarding flavonoids was also found for phenolic compounds when other works in the literature were compared (Cavaliere et al., 2007; Dueñas et al., 2012; Lee et al., 2014; Nam et al., 2014; Verardo et al., 2015; Guzmán-Ortiz et al., 2017). p-Coumaric acid (28) and p-coumaroylglucarate or p-coumaroylgalactarate (19) were the phenolic compounds identified in this study. p-Coumaric acid is a phenolic compound commonly found in soybeans (Lee et al., 2008; Dueñas et al., 2012; Verardo et al., 2015; Guzmán-Ortiz et al., 2017). For compound 19, however, no reports of its presence in the chemical composition of soybeans were found. This compound presented the molecular ion [M-H]⁻ at m/z 355 and, according to Lin & Harnly (2007) and Coutinho et al. (2016), it was identified as a coumaroyl ester of glucaric or galactaric acid, with the loss of 164 Da resulting from a McLafferty rearrangement in the ester group and giving the main fragment at m/z 191 [M-H-Coumaroyl]⁻. p-Coumaroylglucarate has already been identified in hydromethanolic extracts from sugar cane leaves (Coutinho et al., 2016) and orange peels (Lin & Harnly, 2007).

When the various chemical characterization studies with soybeans were compared, qualitative and quantitative variations in the grain composition were observed. Therefore, it is possible to suggest that there is no single composition for soybeans. Environmental and genetic factors strongly affect these variations (Lozovaya et al., 2005; Lee et al., 2008; Nam et al., 2014; Freiria et al., 2016). Thus, climatic conditions, seasonality, soil quality, and growth stage can promote differences in the chemical composition of plants (Dhifi et al., 2016). In addition, the extraction process and the experimental conditions of the analysis may determine whether a compound or class of compounds is detected in a plant. This is the first time that the chemical composition of soy produced in the Triângulo Mineiro region has been reported.

Mass Spectrometry Analysis by Direct Injection and Chemometrics of Soybean Samples From Warehouses 1, 2, and 3
The direct injection experiments were carried out by injecting the methanolic solutions directly into the mass spectrometer with a continuous flow pumped by the HPLC. In this way, all ions present in the sample in the positive and negative modes were acquired without prior chromatographic separation. Figures 4 and 5 show the mass spectra containing the soy chemical profile, in positive and negative modes, for each warehouse analyzed. The results generated by LC-MS-ESI, in the negative and positive modes, initially suggested that the major chemical constituents, or the components most sensitive to the mass spectrometry technique, were the same in all three warehouses studied. However, chemometrics made it possible to verify that there is variance between the soybeans collected in the different warehouses.
The results below refer to the PCA of the spectral matrix recorded in the negative mode. Figure 6 presents the PCA score plot of the first three principal components (PCs), which together explained 98.79% of the variance, capturing almost all the information from the original data in the dimensional reduction. There is a well-defined separation between the three groups of samples, except for sample 20, which lay away from its original group without interfering with the other ones. This result indicates a marked difference in chemical composition between the grains of warehouses 1, 2, and 3. Figure 7 presents the loading plot of PC1, which shows the m/z values responsible for the separation of the groups observed in Figure 6. Figure 8 presents the HCA graph, which groups the samples according to their similarities. Thus, the closer to 100% the link between two samples, the more chemically similar their compositions. The HCA shows that the soy samples collected in warehouses 1, 2, and 3 follow the pattern shown in the PCA, again exhibiting different chemical characteristics for the soybeans of each warehouse. It is also noticed that the grains from warehouses 1 and 3 are more similar in composition to each other than to those of warehouse 2.
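How an HCA similarity scale of the kind described above can arise is sketched below in plain Python. The article does not state the distance metric, the linkage rule, or the similarity formula used by PLS_Toolbox, so Euclidean distance, average linkage, and similarity = 100 × (1 − d/d_max) are assumptions chosen for illustration, and the three sample coordinates are invented.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two samples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hca_average_linkage(samples):
    """Agglomerative clustering; returns merges as (members_a, members_b, distance)."""
    clusters = [[i] for i in range(len(samples))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                d = sum(euclidean(samples[i], samples[j])
                        for i, j in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges


def similarity_percent(merges):
    """Rescale linkage distances so that the last merge sits at 0% similarity."""
    d_max = max(d for _, _, d in merges)
    return [(a, b, 100.0 * (1.0 - d / d_max)) for a, b, d in merges]


# Made-up coordinates for three samples (indices 0, 1, 2).
samples = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
merges = hca_average_linkage(samples)
for members_a, members_b, similarity in similarity_percent(merges):
    print(members_a, members_b, round(similarity, 1))
```

In this toy case the two close samples link first at a high similarity, and the distant one joins last at 0%, which is the reading applied to the warehouse groups in the text.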

The molecular ions [M-H]⁻ of m/z 191, 341, 671, 695, 746, 834, 858, 862, and 977 were related to the variance of soybean samples in the negative mode. The molecular ions at m/z 191 and m/z 341 have been identified by LC-MS-ESI and refer to the compounds isocitrate or citric acid, and sucrose, respectively.
The results below refer to the PCA of the spectra in the positive mode (related to ions [M+H]⁺). Figure 9 shows the 3D PCA graph constructed with the first three principal components, which explained 94.04% of the variance. As in the PCA of the data obtained in the negative mode, the samples of each warehouse were distinguished from one another, and a small difference between the samples that form each group can be observed. This suggests that the compounds that ionize best in the positive mode (generally Brønsted bases) vary in presence or concentration in each seed, regardless of the warehouse. Figure 10 shows the main ions with m/z [M+H]⁺ responsible for the variation between the soy samples from the three warehouses in PC2. Figure 11 represents the HCA graph, which shows the degree of similarity between the samples and indicates the similarity between the seeds from warehouses 2 and 3 regarding their contents. The mass spectrometry experiments by direct injection, together with the PCA and HCA chemometric analyses, showed that the three studied warehouses have different chemical constitutions.
The molecular ions [M+H]⁺ that promoted the variance between the soybeans from warehouses 1, 2, and 3 were those at m/z 104, 230, 318, 655, 759, 783, 823, 893, and 920. The molecular ion [M+H]⁺ at m/z 230 has been previously identified by LC-MS-ESI as xestoaminol C (Table 1), a sphingolipid (Huang et al., 2016). The molecular ions of m/z 759, 783, and 823 are related to the molecular formulas C43H87N2O6P, C45H87N2O6P, and C48H91N2O6P, respectively. According to the works of Fred & Tinoco (2015), Huang et al. (2016), and Qu et al. (2018), it is possible to suggest that these compounds are also sphingolipids classified as sphingomyelins. It is noteworthy here that these non-polar high molecular mass compounds were observed only when the experiment was carried out by direct injection of soybean extracts in the mass spectrometer. The chromatographic conditions used in the LC-MS-ESI experiments hindered the observation of these compounds.
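As a consistency check on the formulas proposed above, the monoisotopic m/z of each [M+H]⁺ ion can be computed from its molecular formula and compared with the integer m/z reported. The Python sketch below uses standard monoisotopic masses; the helper names are hypothetical, and taking the integer part by truncation is an assumption about how the integer m/z values were derived.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "P": 30.97376200,
}
PROTON = 1.00727646  # mass of a proton (u)


def parse_formula(formula):
    """'C43H87N2O6P' -> {'C': 43, 'H': 87, 'N': 2, 'O': 6, 'P': 1}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts


def protonated_mz(formula):
    """Monoisotopic m/z of the singly charged [M+H]+ ion."""
    counts = parse_formula(formula)
    neutral = sum(MONOISOTOPIC[element] * n for element, n in counts.items())
    return neutral + PROTON


# Formulas and integer m/z values as reported in the text.
reported = [("C43H87N2O6P", 759), ("C45H87N2O6P", 783), ("C48H91N2O6P", 823)]
for formula, integer_mz in reported:
    mz = protonated_mz(formula)
    print(formula, round(mz, 4), int(mz) == integer_mz)
```

The same helper could be pointed at any other formula in the article, provided the element table is extended to cover the atoms involved.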

Conclusion
The soy produced in the Triângulo Mineiro/MG region is a source of different classes of metabolites, with emphasis on phenolic compounds, flavonoids, and isoflavones. The methodology employed does not allow the identification of all the chemical components present, since the extraction and analytical methods, in addition to environmental conditions, soil composition, type of cultivar, and growth stage, may affect the chemical composition observed for soybeans. Nevertheless, the chemical profile of the grains produced in this region, which was previously unknown, was determined. This profile was proposed using extraction with methanol, a polar solvent. The soybeans from the three warehouses investigated showed similar qualitative chemical profiles when analyzed by liquid chromatography coupled with mass spectrometry. However, the PCA and HCA revealed that both positive and negative ions contributed to the variance in chemical composition between the soybeans of each warehouse.