Author Topic: Autosomal PCA charts of Europe and Near East  (Read 5643 times)

Offline Александар Невски

Autosomal PCA charts of Europe and Near East
« Posted: February 08, 2014, 04:05:12 PM »
After taking the Family Finder autosomal test, I noticed a shortage on the internet of complete, multidimensional PCA charts of European and Near Eastern populations. Of course, there are many BGA charts by Dr. Doug McDonald, but they are usually made with limited population sets, and a layman can rarely figure out which dimensions they show. McDonald's PCAs don't use all the available European population sets (for example, no Balkan population other than Romanians appears), so I decided to fill this gap.
In autumn 2013 I wrote my own software for PCA calculations and started using it with all the free genome collections available on the internet. For now, I use HGDP (http://www.hagsc.org/hgdp/files.html), Yunusbayev et al. (http://www.evolutsioon.ut.ee/MAIT/caucasus_data/), Behar et al. (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE21478), 1000 Genomes (http://www.1000genomes.org/) and Hellenthal G, et al. 2014 (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE53626).
I also use about 20 samples from the OpenSNP collection (http://www.ianlogan.co.uk/23andme/23andMe_opensnp.htm), in the cases where I could conclude which people (population) a sample belongs to.

For an introduction to PCA (Principal Component Analysis), see: http://en.wikipedia.org/wiki/Principal_component_analysis
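For readers who want to see the idea in code, here is a minimal sketch of PCA on a tiny genotype table. It is not the software used for the charts in this post (that program is not shown here). It assumes genotypes are coded as 0, 1 or 2 copies of one allele per SNP, and the six toy samples are invented for illustration. The function names (`center_columns`, `principal_axes`, `project`) are hypothetical, and the axes are found by plain power iteration, which is only practical for very small examples.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def center_columns(rows):
    """Subtract each SNP's mean so that every column has mean zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def principal_axes(centered, k=2, iters=200):
    """Find k orthogonal unit axes over SNPs by power iteration with deflation."""
    n_snps = len(centered[0])
    axes = []
    for _axis in range(k):
        v = [1.0 / (j + 1) for j in range(n_snps)]  # arbitrary non-zero start
        for _step in range(iters):
            # w = X^T (X v): the covariance matrix applied to v, up to a constant
            scores = [dot(row, v) for row in centered]
            w = [sum(s * row[j] for s, row in zip(scores, centered))
                 for j in range(n_snps)]
            # remove whatever lies along axes that were already found
            for found in axes:
                overlap = dot(w, found)
                w = [wi - overlap * fi for wi, fi in zip(w, found)]
            norm = math.sqrt(dot(w, w))
            if norm == 0.0:
                break
            v = [wi / norm for wi in w]
        axes.append(v)
    return axes

def project(centered, axes):
    """Coordinates of every sample on every axis."""
    return [[dot(row, axis) for axis in axes] for row in centered]

# Invented toy data: 6 samples x 4 SNPs, two obvious groups of three.
genotypes = [
    [0, 0, 2, 2], [0, 1, 2, 2], [0, 0, 2, 1],
    [2, 2, 0, 0], [2, 1, 0, 0], [2, 2, 0, 1],
]
centered = center_columns(genotypes)
axes = principal_axes(centered, k=2)
scores = project(centered, axes)
for row in scores:
    print([round(x, 3) for x in row])
```

On this toy table the first three samples fall on one side of axis 1 and the last three on the other. Real data sets have hundreds of thousands of SNPs, so a real program would use a linear algebra library instead of nested Python loops.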

I decided to show here the first four axes (dimensions) for Europe together with the Near East and three North African populations.

Here are the 1st axis (vertical) and the 2nd (horizontal). I deliberately swapped their usual order so that the chart looks as much as possible like a geographic map of Europe and the surrounding regions. You may notice that Europeans are strongly compressed on axis 2, because most of the variation on it lies between the Near East, the Caucasus and Northwestern Africa.

This is the enlarged European part of the first two axes.
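The axis swap and the enlargement can be sketched in a few lines. This is only an illustration under assumptions: the helper names (`to_map_coords`, `bounding_box`) are hypothetical, the coordinates are invented numbers rather than values from the charts, and the optional mirroring is there because the sign of a PCA axis is arbitrary, so a chart may need flipping before it resembles a map.

```python
def to_map_coords(scores, flip_x=False, flip_y=False):
    """Put axis 2 on the horizontal and axis 1 on the vertical, optionally mirrored."""
    sx = -1.0 if flip_x else 1.0
    sy = -1.0 if flip_y else 1.0
    return [(sx * pc2, sy * pc1) for pc1, pc2 in scores]

def bounding_box(points, margin=0.05):
    """Plot limits (xmin, xmax, ymin, ymax) around points, padded by a fraction of the span."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    pad_x = (max(xs) - min(xs)) * margin
    pad_y = (max(ys) - min(ys)) * margin
    return (min(xs) - pad_x, max(xs) + pad_x, min(ys) - pad_y, max(ys) + pad_y)

# Invented (region, axis 1, axis 2) values, for illustration only.
samples = [
    ("Europe", 0.02, 0.010),
    ("Europe", 0.05, 0.015),
    ("Europe", -0.01, 0.005),
    ("Near East", -0.11, 0.100),
    ("North Africa", -0.09, -0.120),
]

all_points = to_map_coords([(pc1, pc2) for _, pc1, pc2 in samples])
europe_points = [pt for (region, _, _), pt in zip(samples, all_points)
                 if region == "Europe"]

print("full chart limits:    ", bounding_box(all_points))
print("enlarged Europe limits:", bounding_box(europe_points))
```

The enlarged chart uses the same coordinates as the full one. Only the plot limits change, which is why populations that look compressed on the full chart spread out once the limits are drawn around the European samples alone.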

This is the chart of axis 3 (vertical) and axis 1 (horizontal). Here the European populations are much more differentiated.

This is the chart of axis 4 (vertical) and axis 1 (horizontal). Here, too, the European populations are more differentiated than on axis 2.

Here is the enlarged European part of axes 1 and 4.

These are the first two axes for Europe without its surroundings. Axis 1 is vertical. Here the European populations are much more differentiated than on axes 1 and 2 of the chart that includes the Near East and North Africa. Georgia and Cyprus are included to show the direction toward the Near East and the Caucasus.

For these PCAs I used the following samples from the HGDP collection: French, Basque, Sardinian, Tuscan, Northern Italian, Orcadian, Palestinian, Mozabite, Bedouin (Israel), Druze, Adyghe and Vologda Russian.

From Yunusbayev et al. I used Abkhaz, Armenian, Bulgarian, Ukrainian, Chechen, Mordvin and Nogai samples.

From Behar et al. I used Romanian, Hungarian, Belorussian, Lithuanian, Spanish, Chuvash, Turkish, Cypriot, Moroccan, Egyptian, Georgian, Syrian, Saudi Arabian and Yemeni samples, as well as Sephardic and Ashkenazi Jews.

From "1000 Genomes" I used Finnish samples, additional Tuscan samples, and British samples from Kent, Cornwall and Scotland.

From "Hellenthal G, et al. 2014" I used Polish, Norwegian, Hellenic (Greek), Southern Italian, Sicilian and Tunisian samples.

From the "OpenSNP" collection I mostly have one or two samples each for several of the remaining European populations, for example Germans or Danes. Some samples behave very strangely on the PCA charts, for example the only sample I have from Portugal. In the future I may exclude samples that do not cluster where they are expected.
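One simple way to flag samples that do not cluster where expected is a nearest-centroid check. The sketch below is an assumption about how such a check could look, not a description of how it is done for these charts. The function names and population labels "A" and "B" are hypothetical, the coordinates are invented, and "expected" means the reference population the sample should fall closest to.

```python
import math
from collections import defaultdict

def centroids(reference):
    """reference: list of (population, coords). Returns the mean coords per population."""
    groups = defaultdict(list)
    for population, coords in reference:
        groups[population].append(coords)
    return {population: [sum(axis) / len(rows) for axis in zip(*rows)]
            for population, rows in groups.items()}

def nearest_population(coords, cents):
    """Name of the population whose centroid is closest in PCA space."""
    return min(cents, key=lambda population: math.dist(coords, cents[population]))

def unexpected_samples(reference, candidates):
    """candidates: list of (sample_id, expected_population, coords)."""
    cents = centroids(reference)
    flagged = []
    for sample_id, expected, coords in candidates:
        nearest = nearest_population(coords, cents)
        if nearest != expected:
            flagged.append((sample_id, expected, nearest))
    return flagged

# Invented coordinates on two axes, for illustration only.
reference = [
    ("A", (0.0, 0.0)), ("A", (0.2, 0.0)), ("A", (0.0, 0.2)),
    ("B", (5.0, 5.0)), ("B", (5.2, 5.0)), ("B", (5.0, 5.2)),
]
candidates = [
    ("s1", "A", (0.1, 0.1)),   # lands next to A, as expected
    ("s2", "A", (4.9, 5.1)),   # expected A, but lands next to B
]
print(unexpected_samples(reference, candidates))
```

With only one or two samples per population, a single odd sample cannot be judged against its own group, which is why the check compares it with reference populations that have enough members to form a stable centroid.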

If anyone wants to take part in my PCA (and other) calculations with their sample, please send it to the following email address.

I accept FTDNA or 23andMe samples from European populations that I don't have at all or don't have in sufficient (two-digit) numbers, especially from the Balkans. The populations needed are: Serbs, Croats, Slovenes, Bosnians and Herzegovinians of all religions, Macedonians, Albanians, Portuguese, Slovaks, Rusyns, Moldovans, Czechs, Germans, Austrians, Swiss, Irish, Dutch, Lapponians (Sami) and Danes.
Please send me only samples where all four grandparents are from the same nation. In the email, please also state the place or region each of the four comes from, together with their ethnicity and religion (if it makes a difference, as in Balkan countries such as Macedonia, Bosnia or Bulgaria). In exchange for the right to use the sample in my calculations, I will send you images showing your sample's projection on the first four axes. I will not under any circumstances disclose any private data, such as the name attached to a sample. I will not respond to samples that do not fit the declared ethnicity. For example, if someone claims to be fully Spanish but clusters with Russians, I will neither respond nor use the sample. And please don't send samples of mixed origin, such as half Swedish, half Italian.
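Projecting a single new sample onto axes that were already computed from the reference populations can be sketched as follows. This is an illustration under assumptions, not the actual program: `project_sample` is a hypothetical name, the SNP means and axes are invented toy values, genotypes are assumed to be coded 0, 1 or 2, and a missing genotype (`None`) is replaced by the reference mean so that it contributes nothing to the coordinates.

```python
def project_sample(genotype, snp_means, axes):
    """Coordinates of one sample on precomputed axes.

    genotype:  list of 0/1/2 values, or None where the SNP was not called
    snp_means: per-SNP means of the reference samples used to build the axes
    axes:      list of unit vectors over the same SNPs, in the same order
    """
    centered = [0.0 if g is None else g - m
                for g, m in zip(genotype, snp_means)]
    return [sum(c * a for c, a in zip(centered, axis)) for axis in axes]

# Invented reference values for four SNPs and two axes.
snp_means = [1.0, 1.0, 1.0, 1.0]
axes = [
    [0.5, 0.5, -0.5, -0.5],
    [0.5, -0.5, 0.5, -0.5],
]

complete = [2, 2, 0, 0]
with_gap = [2, None, 0, 0]   # second SNP missing in this sample

print(project_sample(complete, snp_means, axes))
print(project_sample(with_gap, snp_means, axes))
```

Because the new sample is centered with the reference means and multiplied by the stored axes, it lands on the existing chart without moving any of the reference populations. Recomputing the whole PCA with the new sample included is the alternative, at the cost of slightly shifting every point each time a sample is added.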

From time to time I will update the charts on this page, and maybe add new ones, as I find or receive new samples. I keep this page closed, but if anyone wants to comment on it, please open another one. I hope to open a page like this in Russian as soon as I find someone who can provide a proper translation.
« Last edit: February 22, 2014, 07:50:00 AM by Александар Невски »