New algorithm may help trace origins of 'rootless' Basque language
21/10/2020
AN ALGORITHM able to decipher 'dead languages' could throw light on one of Spain's biggest linguistic mysteries: where the Basque tongue, euskera, comes from.
Spanish, and all of Spain's regional languages except euskera, have their roots in Latin and are known as the 'romance languages', along with, for example, French, Italian, Portuguese and Romanian.
Other groups include Germanic, which includes the Scandinavian tongues, and Celtic, which embraces Scottish Gaelic (Gàidhlig), Irish Gaelic and Cornish.
But euskera, said to be incredibly hard – in fact, nearly impossible – for non-native speakers to learn to a level of effective communication on any subject, appears to have no known roots; no other language on Earth has been found to be related to it.
Another, older linguistic 'mystery' is that of íbero, or the Iberian language – the indigenous tongue spoken by some of modern-day Spain's earliest human inhabitants, which stretched as far as southern France in one direction and inland Andalucía in the other.
Its native speakers would have been alive between about the seventh and first centuries BCE, or around 2,020 to 2,600 years ago, and the language was most in use before the Migration Era, thought to have been in the late fourth century CE (AD).
Iberian is thought to have died out in the first 200 years of the first millennium CE, since the spread of the Roman Empire into what is now mainland Spain and Portugal saw Latin become the most-used tongue.
Iberian is classed as a 'Paleohispanic language'; euskera is the only one of these still spoken, and it has no links to any other tongue in current use.
Euskera speakers make up just under three in 10 inhabitants of the Spanish Basque territories – the Basque Country's three provinces and neighbouring Navarra – and of three former provinces just over the border in France. They total around 751,500, roughly the population of Valencia city, and over 90% of them live on the 'Spanish side'.
If, as some linguistic experts suspect, euskera is derived from the original Iberian tongue, this would make it the oldest language in Spain in modern use.
Researchers from the Computer Science and Artificial Intelligence Laboratory (CSAIL) at the Massachusetts Institute of Technology (MIT) have developed a program which, using only a few thousand words of a given language, can point towards its possible roots.
According to Professor Regina Barzilay of the MIT team, it works through accessing a corpus of texts of modern and ancient languages, drawing on existing linguistic history knowledge, to make comparisons.
Language evolution has largely been predictable, Professor Barzilay explains. As an example, a language rarely adds or drops a complete sound, but certain sound substitutions are likely: a 'p' in the 'main' tongue might be replaced with a 'b' in an offshoot language, but would probably not be replaced with a 'k', which is pronounced very differently.
Working with PhD student Jiaming Luo, Professor Barzilay devised an algorithm which detects minute changes and similarities in pronunciation, 'chopping up' words in an ancient language to form a logical rule-base.
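The 'p', 'b' and 'k' example can be made concrete with a small sketch. This is not the MIT team's algorithm, which the article does not describe in detail; it is a standard weighted edit distance, used here only to illustrate the idea that some sound changes are 'cheaper' than others. The feature table, the cost values and the test words are all invented for illustration.

```python
# Hypothetical feature table for a few stop consonants:
# (place of articulation, voiced?). Values are illustrative only.
FEATURES = {
    "p": ("labial", False),
    "b": ("labial", True),
    "t": ("dental", False),
    "d": ("dental", True),
    "k": ("velar", False),
    "g": ("velar", True),
}


def substitution_cost(a, b):
    """Cost of replacing sound a with sound b (assumed weights)."""
    if a == b:
        return 0.0
    fa, fb = FEATURES.get(a), FEATURES.get(b)
    if fa is None or fb is None:
        return 1.0  # no feature data: treat as a full substitution
    place_differs = fa[0] != fb[0]
    voicing_differs = fa[1] != fb[1]
    # A change of voicing alone ('p' to 'b') is cheap;
    # a change of place ('p' to 'k') is expensive.
    return 0.75 * place_differs + 0.25 * voicing_differs


def weighted_distance(source, target, indel_cost=1.0):
    """Edit distance in which substitutions use the phonetic costs above."""
    prev = [j * indel_cost for j in range(len(target) + 1)]
    for i, s in enumerate(source, start=1):
        curr = [i * indel_cost]
        for j, t in enumerate(target, start=1):
            curr.append(min(
                prev[j] + indel_cost,                   # drop s
                curr[j - 1] + indel_cost,               # add t
                prev[j - 1] + substitution_cost(s, t),  # swap s for t
            ))
        prev = curr
    return prev[-1]


# Invented word forms, not real vocabulary from any of the languages discussed.
print(weighted_distance("pata", "bata"))  # 'p' to 'b': small distance
print(weighted_distance("pata", "kata"))  # 'p' to 'k': larger distance
```

Under these assumed weights, a candidate cognate that differs only by a plausible substitution scores as closer than one requiring an implausible jump, which is the kind of regularity the researchers say their rule-base exploits.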
Last year, they wrote a paper after deciphering the dead Ugaritic tongue – a Semitic language which had been extinct since the 12th century BCE but was discovered by archaeologists in what is now the city of Ras Shamra in Syria – and also the so-called Linear B written language system, used in Mycenaean Greece during the end of the Bronze Age, from around 1600 to 1100 BCE.
Even though these tongues were known to be linked to Hebrew and Greek, it still took the linguistic community several decades to unravel them.
Yet the algorithm they have developed may be able to do this in a matter of hours.
Professor Barzilay and Jiaming Luo say they are working on cracking texts based upon the meaning of the words, or semantics, enabling them to decipher languages they are unsure how to pronounce correctly.
“For example, we might be able to identify all the references to people or places in a document, which could then be researched in their historical context – methods which are commonly and currently used in several processing applications and which are very accurate – but the key issue in this research is whether it's a feasible task when you have no data on usage of an ancient language,” explains Professor Barzilay.
Photograph 1, of Bilbao's Guggenheim Museum from Flickr
Photograph 2, a district map of the Basque Country, by Joan M. Borràs (ebrenc) on Wikimedia Commons