3. Technical Platform
Return to Working Plan for Taking WiserEarth in Multiple Languages
Instruction: Discuss in the comment section and summarize here. Upon saving edits, copy the summary to the appropriate section of the main page linked above. This ensures that the main document contains the most up-to-date information.
This section aims to identify all the technical needs for this endeavor.
3.1 Data Center/Server
Currently the data center for WiserEarth is based in Texas, in the US. We may decide at first to host all the sites there and analyze the performance. It may later be necessary to set up another data center in Europe (although Europe's data privacy rules would make that difficult) or in Asia. One option is to use a combination of Amazon EC2 (to augment processing power) and Amazon SimpleDB or even Freebase to track all user-generated translations.
3.2 Technical Features
- Ability to translate fields and dialog boxes from the English WiserEarth site into the local site's language, or vice versa.
- Ability for users to express a language preference through browser settings and/or their user profile.
- Ability for users to toggle between different language versions of the same content
- Ability for users to select text-only or expanded graphics features.
- Ability for an administrator to send articles and key profiles that need translation to an automatic queue, so that any user can volunteer to translate them right away (see the wiki where users vote for translation into a specific language to gauge demand).
- Open question: how the sites operate. Separate instances seem unlikely (updates would be too hard to manage); more probably a single UI with translated fields and a filter on which language's content to show.
- Data sharing between all languages
- Facilities for translation teams to provide translations.
- Human-and-machine translation server to build a WiserEarth community translation memory and assist further translation effort (see discussion for more details)
- The translation would be displayed instead of the original item according to the preferences of the reader. The type of translation (by author, by team or automatic) would be clear and a button would be available to see the item in the original or in another language.
- A feature that allows organizations to create their profile within the WiserEarth platform and access it through their own domain name. That would be a powerful step towards Wiser Commons.
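The language-preference feature listed above could be sketched roughly as follows. This is a minimal illustration, not WiserEarth's actual code: the list of supported languages, the default, and the function names are all assumptions, and the rule that a profile setting outranks the browser's Accept-Language header is one possible policy among several.

```python
from typing import List, Optional

# Hypothetical site configuration, for illustration only.
SUPPORTED_LANGUAGES = ["en", "fr", "es", "id"]
DEFAULT_LANGUAGE = "en"


def parse_accept_language(header: str) -> List[str]:
    """Return the language tags of an Accept-Language header, best first."""
    weighted = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        quality = 1.0  # a tag without a q-value counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # ignore malformed weights
        weighted.append((quality, tag.strip().lower()))
    # Python's sort is stable, so equal weights keep their header order.
    weighted.sort(key=lambda pair: pair[0], reverse=True)
    return [tag for quality, tag in weighted if quality > 0]


def choose_language(profile_language: Optional[str],
                    accept_language: Optional[str]) -> str:
    """Pick a site language: profile setting first, then browser settings."""
    candidates = []
    if profile_language:
        candidates.append(profile_language.lower())
    candidates.extend(parse_accept_language(accept_language or ""))
    for tag in candidates:
        if tag in SUPPORTED_LANGUAGES:
            return tag
        primary = tag.split("-")[0]  # "fr-ca" falls back to "fr"
        if primary in SUPPORTED_LANGUAGES:
            return primary
    return DEFAULT_LANGUAGE
```

With this sketch, `choose_language(None, "fr-CA,fr;q=0.9,en;q=0.8")` returns `"fr"`, while a user whose profile says `"es"` gets Spanish regardless of browser settings.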
3.3 Localized homepage
- Ability to automatically queue / rotate content on homepage based on AOFs so that there is minimal actual management required
3.4 Ongoing maintenance of code/debugging
Having a multilingual version of WiserEarth on a single platform, and thus hosted on the same server, allows the same team to handle maintenance. Specialized features may be needed to reflect cultural diversity, such as support for more oral-oriented participation in African nations. The local/regional community of users can develop the technical specification/requirements for such a feature, while the core team of developers does the actual coding to ensure the quality and consistency of the code.
Comments (1 - 8 of 8)
|
bowo 7 months ago
Note: Have just copied JP's edits to the main document here and added some of my own.
|
|
@Roger: Have incorporated some of your suggestion in 4. Governance Model regarding technical features.
@Angus: The system I proposed seems to be working quite well for international firms that need to be present in multiple languages. Though I have no experience using such tools extensively, their proliferation in the professional translator community seems like a good indicator that they actually help with translation efforts. It needs saying that although the system will be useful in the long run for translating content between any two languages, it won't be immediately useful for the initial localization effort of translating the UI and key pages into a different language. That will have to be manual labor.
|
@Rehan: Amazon SimpleDB and Freebase do sound interesting. Freebase seems especially relevant to expand the reach and usefulness of WiserEarth's database.
"Amazon SimpleDB is a web service for running queries on structured data in real time. This service works in close conjunction with Amazon Simple Storage Service (Amazon S3) and Amazon Elastic Compute Cloud (Amazon EC2), collectively providing the ability to store, process and query data sets in the cloud. These services are designed to make web-scale computing easier and more cost-effective for developers."
"Freebase is an open database of the world’s information. It is built by the community and for the community—free for anyone to query, contribute to, build applications on top of, or integrate into their websites. Already, Freebase covers millions of topics in hundreds of categories. Drawing from large open data sets like Wikipedia, MusicBrainz, and the SEC, it contains structured information on many popular topics, like movies, music, people and locations—all reconciled and freely available via an open API. This information is supplemented by the efforts of a passionate global community of users, who are working together to add structured information on everything from philosophy to European railway stations to the chemical properties of common food ingredients. ... while information in Freebase appears to be structured much like a conventional database, it’s actually built on a system that allows any user to contribute to the schemas—or frameworks—that hold the data. This wiki-like approach to structuring information lets many people organize the database without formal, centralized planning. And it lets subject experts who don’t have database expertise find one another, and then build and maintain the data in their domain of interest.
Here are three good reasons to contribute to Freebase: 1) You’ve got a bunch of data that you’d like to share with the world. Freebase gives you a place to do it. A related benefit: once your data is in Freebase, you or anyone else can run MQL (Metaweb Query Language) queries against it. 2) You’ve got a bunch of data that you’d like to share, and said data would benefit from the knowledge and refinement efforts of other people. Freebase gives you a place to share it and others a place to improve it. 3) You don’t have data, but you're an authority on something, and you like sharing your expertise. Freebase lets you dive into the details and improve or add to existing data." |
|
@ Bowo: Great work on translating - need to really keep that in mind for localization. Presume a system such as the ones you describe would make it easier to do each new localization.
|
|
@ Roger: I think the idea for now is one site / one instance, with the platform handling multiple languages. However, Wikipedia has multiple distinct sites (i.e. one for each language), which bears remarking .....
|
|
> Currently the data center for WiserEarth is based in the US, Texas. > We may decide at first to put all the sites there
This phrase, "put all the sites", suggests to me that languages will be implemented as entirely separate WiserEarth sites. If I have interpreted this correctly, then it is a basic decision that needs to be discussed. By having separate WiserEarth sites for each language, we lose the advantage of getting everyone on the same page. Surely there is a way to handle separate languages on the same site.
|
How about having a human-and-machine translation server for all translation efforts done in/for WiserEarth? This should help ease further translation efforts. Below is the basic concept, copied from the discussion "What does it mean to Internationalize?"
Camilla asked:
- Does the community automate the translation?
- Does the community look for volunteer translators and do this manually?
Have done some serendipitous research in this regard, with the answer being both. The translation effort must start manually, but can be automated to a certain degree once sufficient "translation database between languages" is developed by the community.
I discovered that the translator community has software that aids their work in translating documents. The latest development of this software enables the translation know-how of a great number of translators to be aggregated into one large software-assisted human-auto-translation engine. Much like Google Translate, but using the database of translations by real, professional translators!
WiserEarth can imagine setting up such translation software/database and integrating it into WiserPlatform somehow. Or, if that is technically too difficult or impossible, WE can enable the community of volunteer translators to access, add to and extract translations from the software/database, and then easily copy and paste them into WiserEarth's wikipages/wikispaces for further editing/refinement.
Here's a breakdown of the concept, explained further below:
1. Each translator can develop his/her own translation memory (TM).
2. A termbase (TB) for each area of focus can be developed together by all translators.
3. These TMs and TBs can then be integrated into the translation software/database.
4. Any new translation effort can benefit from this database, where the workload for each translation can be reduced significantly.
Now, a more detailed explanation of each:
1. Each translator can develop his/her own translation memory (TM).
A translation memory is a linguistic database that continually captures your translations as you work for future use. All previous translations are accumulated within the translation memory (in source and target language pairs called translation units) and reused so that you never have to translate the same sentence twice. The more you build up your translation memory, the faster you can translate subsequent translations, enabling you to take on more projects and increase your revenue.
2. A termbase (TB) for each area of focus can be developed together by all translators.
Again, from a leading Computer Aided Translation (CAT) software package: Terminology is the foundation of all communication. At its most basic level it is the study and ultimately usage of words or phrases that have a particular meaning, these words or phrases are referred to as terms. Terminology is growing in importance as terms are becoming increasingly adopted by organizations to describe a company, product, service or even a unique selling point. A termbase is a central repository, similar to a database, which allows for the systematic management of approved terms. It provides definitions and indicates when a particular term should be used. Use of a termbase alongside your existing translation environment ensures that you produce more accurate and consistent translations.
3. These TMs and TBs can then be integrated into the translation software/database.
I found three examples of this, where the TMs of a large number of translators are connected via the web and thus accessible to all translators:
a. Lingotek's Language Search Engine (LSE). Commercial?
b. Wordfast's Very Large Translation Memories (VLTM). Partly commercial: the client comes at a cost, the VLTM is free.
c. Across Language Server. Partly commercial: the Personal Edition is free, with access to some key features of the server.
4. Any new translation effort can benefit from this database, where the workload for each translation can be reduced significantly.
As a result of the above, and as the database of translations for sentences and terminology expands, each new translation effort takes less time to do and gains in accuracy.
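To make the four steps above a little more concrete, here is a minimal sketch of a shared translation memory and termbase. It is not how Lingotek, Wordfast or Across work internally; the class and function names, the 0.75 similarity threshold, and the use of Python's standard `difflib` for fuzzy matching are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher
from typing import Dict, Optional, Tuple


@dataclass
class TranslationMemory:
    """Translation units (source sentence -> target sentence) for one language pair."""
    source_lang: str
    target_lang: str
    units: Dict[str, str] = field(default_factory=dict)

    def add(self, source: str, target: str) -> None:
        """Step 1: a translator records a finished translation."""
        self.units[source.strip()] = target.strip()

    def merge(self, other: "TranslationMemory") -> None:
        """Step 3: fold another translator's TM into the shared one."""
        if (other.source_lang, other.target_lang) != (self.source_lang, self.target_lang):
            raise ValueError("language pairs differ")
        self.units.update(other.units)

    def lookup(self, sentence: str,
               threshold: float = 0.75) -> Optional[Tuple[str, str, float]]:
        """Step 4: return the closest stored unit as (source, target, score), or None."""
        sentence = sentence.strip()
        best = None
        for source, target in self.units.items():
            score = SequenceMatcher(None, sentence.lower(), source.lower()).ratio()
            if score >= threshold and (best is None or score > best[2]):
                best = (source, target, score)
        return best


def approved_terms(sentence: str, termbase: Dict[str, str]) -> Dict[str, str]:
    """Step 2: list the termbase entries that occur in a sentence."""
    lowered = sentence.lower()
    return {term: target for term, target in termbase.items()
            if term.lower() in lowered}
```

A translator working on "Join this group" would get back the stored unit for "Join the group" as a fuzzy match to edit, rather than starting from scratch. A sentence with no close match returns `None` and must be translated manually, which is why the initial localization remains manual labor.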
Concluding this section, I'll restate my previous point:
WiserEarth can imagine setting up such translation software/database and integrating it into WiserPlatform somehow. Or, if that is technically too difficult or impossible, WE can enable the community of volunteer translators to access, add to and extract translations from the software/database, and then easily copy and paste them into WiserEarth's wikipages/wikispaces for further editing/refinement.
|
Amazon EC2 is highly volatile and designed to augment processing power. It should not be used as a "server". The key technical issue for translations seems to be data storage.
It would be interesting to see how we could use Amazon SimpleDB or even Freebase to track all user-generated translations.

