On Dan Miczarski’s podcast Shift Left (a production of Blackbird.io), we scrutinize the language industry tradition of the Localization Project and ask: Is the concept still relevant and useful? What might be a better way?
By the way, this is not the first challenge to the venerated Localization Project. Way back in 2014 Rubén Pérez García wrote the article “Localisation Service Management Principles” in Localisation Focus, the International Journal of Localisation (VOL. 13, Issue 1).
He argues that although localization operations have traditionally been based on the principles of project management, the more appropriate framework might actually be IT service management.
I believe that his argument is more relevant than ever and would recommend his article as “required reading” on the topic.
The following “Compton Classic” article was published October 2017 on the now defunct Moravia Blog and is being re-published here for posterity. Minor edits may have been made to support relevance and clarity.
As technology evolves, so too does our idea of what content is. Let’s dive into a new theory: that content does not equal text, and think about what the future might hold.
In my blog post Is There Such a Thing as a Foreign Market? I elaborate on one of the key assertions made in the article “Sound and Vision,” published in the September issue of MultiLingual magazine. In both pieces, I suggest that the concept of the “foreign market” may be approaching the end of its useful life.
In this post, I’d like to elaborate on a second key assertion from that article: Your content does not equal its form.
The principle of separating content from form has in large part been institutionalized, thanks to advancements such as CSS, XML-based publishing, and systems designed around the concepts of component content management and single-source publishing.
Well, at least in the world of text, anyway.
If you’re talking about content that doesn’t exist natively as text—including audio and video—then as far as I’m concerned, we’re still living in the wild frontier.
But I’d like to punch it up a bit, because I don’t think that very many people would disagree with the above. So, let’s make the assertion bolder: Content does not equal text.
If not text, then what?
I’m going to need to drop a little ontology on you here.
Ontology is a field that originates from classic philosophy and attempts to answer the question, “what is a thing?” as opposed to what’s a characteristic of a thing, a part of a thing, a collection of things, a permutation of a thing, etc. (i.e. what is something’s “category of being”). Not surprisingly, it’s an important concept in computer science where abstract concepts need to be made more concrete so that they can be modelled and developed into working systems.
In the mid-to-late 1990s, it was the application of ontological thinking that yielded the assertion that books, documents, web pages, etc., aren’t content; they’re forms of content.
It’s an idea that gave birth to the content management revolution, which in turn helped to create today’s content and channel-rich world. If you’re a content producer, this is why your authoring efforts can be deployed across more channels more flexibly and affordably than ever before.
The model of content versus form at the heart of many contemporary content management systems.
In this now-standard model, the “ground-level object” (the “thing”) is assumed to be some unit of text, encoded in some human-readable language. Pick any sentence in this blog post, for example. It’s the molecule in the model.
But what if we break the model down even further and look at what comprises a unit of text?
Focusing on a new ground-level object: the core idea.
The new ground-level object becomes something like a sememe: a stand-alone unit encapsulating the core idea of a message. The atoms that make the molecule. Everything else is a representation of that idea, the result of that idea going through a multi-layered process of encoding.
In the case of me typing this sentence, the idea was first encoded by my brain into English words, then encoded into written English text by my fingers on the keyboard, then ultimately encoded into a digital object by my laptop.
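To make the model a bit more concrete, here is a minimal Python sketch of what such an ontology might look like in code. Every name in it is hypothetical; the point is simply that the idea, not any particular string of text, is the ground-level object, and that text, audio, and video hang off of it as encoded renderings.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Rendering:
    """One encoded form of a core idea (all names here are hypothetical)."""
    modality: str            # e.g. "text", "audio", "video"
    language: Optional[str]  # None for language-independent forms
    payload: bytes           # the encoded artifact itself


@dataclass
class CoreIdea:
    """The ground-level object: the idea, decoupled from any particular form."""
    concept_id: str
    renderings: List[Rendering] = field(default_factory=list)

    def find_rendering(self, modality: str, language: Optional[str] = None) -> Optional[Rendering]:
        """Return a rendering matching the requested modality (and language, if given)."""
        for r in self.renderings:
            if r.modality == modality and (language is None or r.language == language):
                return r
        return None


# The sentence typed in this post, modeled as just one rendering of an idea.
idea = CoreIdea(concept_id="idea:content-does-not-equal-text")
idea.renderings.append(
    Rendering(modality="text", language="en",
              payload="Content does not equal text.".encode("utf-8"))
)
print(idea.find_rendering("text", "en").payload.decode("utf-8"))
```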
It’s intuitive that ideas can live outside the context of written language; we have non-written conversations with other people all the time. And the existence of art and music demonstrates that a core idea can be expressed outside of the traditional construct of “human language” altogether.
But representing that idea in a computer system without anchoring it to a piece of written, human-language text? Well, that’s a different challenge altogether.
But it’s one worth tackling, because…
Our digital world is becoming less text-centric
With the shift toward mobile computing, advances in speech recognition technology, the rise of new social channels, the proliferation of internet-connected appliances, and the ease with which new types of digital content can be created by anyone, there’s been an increase in the amount of audio, video, and other classes of non-text-based digital information out there.
And we’re no longer necessarily typing on keyboards these days either.
All of this content lives in different modalities, but we don’t want it to be siloed into those different modes. This is the grand promise of the internet: all the world’s information should be accessible and useful from anywhere. Regardless of language…and form.
One solution to indexing content with its meaning has been through semantic metadata that travels with content. This approach helps make content more findable and usable across its various modes, but it doesn’t necessarily address the challenge of the human language barrier, as these tags and namespaces are often based on—you guessed it—strings of human-language-encoded text.
You can see this limitation in action by going to any search engine that allows you to find images using text. Search for “kitten”. Now search for “小猫” or “gatito” or “kätzchen”. In all cases, the engine will return pictures of baby cats, but you’ll notice that the sets of results you get back are entirely different for each search term. Each of these pictures or videos lives in—from the standpoint of findability—different internets, segregated by language.
Nothing to see here. Move along.
To remove this language barrier using metadata, you’d need to either translate all the world’s metadata into all human languages, or find alternative ways of adding content descriptors that aren’t language-dependent.
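To make the second option a little more concrete, here is a hedged Python sketch of what language-independent descriptors might look like. The concept ID and the tiny index are invented for illustration, but they show how queries in any language could resolve to the same set of assets.

```python
# Hypothetical illustration: instead of tagging each image with strings in one
# human language ("kitten", "gatito", ...), tag it with a language-neutral
# concept ID and keep per-language labels as a separate, swappable layer.

# Language-neutral index: concept ID -> assets that depict that concept.
concept_index = {
    "concept:KITTEN-0001": ["kitten_001.jpg", "kitten_002.mp4"],
}

# Per-language labels are just pointers into the same concept space.
labels = {
    ("en", "kitten"): "concept:KITTEN-0001",
    ("es", "gatito"): "concept:KITTEN-0001",
    ("zh", "小猫"): "concept:KITTEN-0001",
    ("de", "kätzchen"): "concept:KITTEN-0001",
}


def search(term: str, language: str) -> list:
    """Resolve a query in any language to the shared concept, then to its assets."""
    concept = labels.get((language, term.lower()))
    return concept_index.get(concept, [])


# All four queries now return the same result set, not four language-siloed ones.
for lang, term in [("en", "kitten"), ("es", "gatito"), ("zh", "小猫"), ("de", "kätzchen")]:
    print(lang, search(term, lang))
```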
When it comes to voice content, for now, text appears to be the lingua franca for brokering spoken language differences too.
The translation industry seems poised to respond to the upcoming explosion of voice-based systems with a combination of speech recognition, transcription, machine translation, and text-to-speech technologies. And while these technologies are becoming impressively mature, used in tandem they will only ever be as strong as the weakest link in the chain. An error in one step will only compound as it travels through the process, yielding results that are incoherent (but can be surprisingly entertaining).
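A quick back-of-the-envelope illustration in Python, with per-stage accuracy figures that are entirely made up: the specific numbers don’t matter, but the multiplication does.

```python
# Illustrative only: per-stage accuracies are invented. Even when every stage
# looks strong in isolation, a chained pipeline inherits all of their errors.

pipeline = {
    "speech recognition": 0.92,
    "machine translation": 0.90,
    "text-to-speech": 0.97,
}

end_to_end = 1.0
for stage, accuracy in pipeline.items():
    end_to_end *= accuracy
    print(f"after {stage:<20} cumulative accuracy ~ {end_to_end:.2f}")

# after speech recognition   cumulative accuracy ~ 0.92
# after machine translation  cumulative accuracy ~ 0.83
# after text-to-speech       cumulative accuracy ~ 0.80
```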
Is there a better way?
Text-free alternatives?
In another post, I’ll further explore how “core ideas” might be encoded without the use of text—or at least without the use of text that belongs to a specific written human language.
And maybe “meaning” doesn’t need to be encoded at all, as long as we can associate content with enough raw statistical data about its characteristics, and can call upon those characteristics in a way that consistently returns and lets us use the “things” that we’re looking for.
With regard to voice translation, there is already movement in the direction of removing intermediary text-based steps.
This approach would cut out transcription and text-based machine translation altogether, and may support systems that work closer to the way human interpreters do.
What do you think? Do you see a benefit in separating content from its form even further? How might our content management and translation practices need to evolve in a “post-text” digital world? Let’s start a dialog!
The following “Compton Classic” article was published October 2017 on the now defunct Moravia Blog and is being re-published here for posterity. Minor edits may have been made to support relevance and clarity.
If you’re a reader of MultiLingual magazine, you may have caught the article I contributed to the September issue dedicated to the theme: audiovisual content. (If not, I’d encourage you to check it out!)
In “Sound and vision,” I describe what I see as the collision of two trajectories that’s resulted in a crisis: the tradition of multimedia being largely excluded from globalization programs, and multimedia’s meteoric rise as a dominant communication channel. Traditional localization, I assert, is not designed to bridge this gap at all.
In the article, I suggest that the solution must start with us challenging some of our classic beliefs about localization. I make some bold assertions that I’d like to expand here, in the broader context of globalization strategy in general.
Assertion: There are no foreign markets.
To quote from the article: the concept of foreign markets is a pre-globalization holdover from an era when businesses were born and raised within the walls of a single nation-state, only later to expand abroad as a logical next step for growth.
The approach of establishing a successful product locally, then expanding globally, has been such a successful growth model for so many companies that it’s stamped into our working definition of localization:
Localization (also referred to as “l10n”) is the process of adapting a product or content to a specific locale or market. Translation is only one of several elements of the localization process.
The above, taken from the Globalization and Localization Association (GALA), is consistent with other definitions you’re likely to come across. It’s all about adaptation to foreign markets by creating new versions.
There’s nothing inherently wrong with this approach, but it raises some significant questions. Why wait to add the characteristics required for global success until after your “baseline” product has already been designed and created? And, what does it cost you—including in ground yielded to the competition—to factor an “Act II” into your plans for world domination?
You may stand by the adaptation model and say, “but localization is all about addressing characteristics like language and culture that are by definition only relevant to specific markets. These characteristics must be adapted or they won’t work worldwide.”
And I would partially agree. It’s crucial that the characteristics of language and culture be managed in a global product and campaign. But there’s a better way than post-creation adaptation. First, let’s look at the idea of hyper-personalization.
Real-world examples of hyper-personalization in action include advertisements that “nudge” me to re-buy something I’ve previously bought, or aggregated “You May Also Like” links that are uncannily accurate. These demonstrate that products, campaigns, and content can dynamically change their form—not only at the level of an individual’s personal preferences and requirements, but within a specific context (in a car travelling on vacation, on their birthday, after the birth of a baby, etc.).
It forces us to ask the question: if language and culture don’t belong in the personal preference equation, then what does?
And if the new bar for target marketing is at the granularity of an individual at any particular moment, then what value is this concept of a “foreign market”?
A problematic metaphor?
Markets, demographics, borders: none of them are real in any physical sense. They’re abstract constructs, like metaphors, that help us to simplify the world so that we can take action. They’re useful…
When it comes to globalization strategy, there are two reasons why I believe the concept of foreign markets may be more harmful than helpful today:
From the standpoint of the design process, it encourages a form of procrastination that results in an under-scoping of requirements, risking that your core product/campaign/content won’t work outside of your “home market” at all. If it can be made to work in other places down the line, it’s a massively expensive remodel project. (“They drive on what side of the road here? That doesn’t sound like my problem, that sounds like an adaptation problem.”)
It suggests that organizations have the luxury of deciding what markets they’d like to enter versus not enter, and that entering a new market can be taken on as an expansion project.
To believe this, one would need to ignore the fact that global organizations wield a massive competitive advantage over non-global organizations that not only leverages scale, but scope. Truly world-class ideas, products and brands transcend geographical, political, linguistic, and cultural borders. They work and are available anywhere in the world.
Think about all the things that you take for granted being able to use (your phone, your credit card) or buy (Red Bull, Nike shoes) pretty much anywhere in the world when you’re travelling internationally.
Without this quality of “useful anywhere and everywhere,” your product or idea is at risk of being implemented more fully by a more global competitor, or by in-market competitors who are better able to manage the linguistic and cultural characteristics of your model locally.
And in-country competition can evolve into global competition as per the examples of Didi and Showmax.
Takeaway? “Starting global” beats “going global.”
One world market
If market segmentation is a construct of convenience bolstered by tradition, what if we shake it up? What if there were only one world market, or if you prefer, seven and a half billion markets (give or take)—one for every person on the planet?
How would that change the way we approach product and campaign design? How might we differently approach the very real, very important task of making sure our initiatives transcend linguistic and cultural barriers so that they can succeed anywhere and everywhere?
This idea of applying hyper-personalization models to globalization strategy—where language, culture, and other globalization characteristics get factored into the core design of a product or campaign—is promising. I predict that its application will yield massive rewards in terms of global scalability. But to get there from here requires viewing the world through the lens of “one world market” and scrapping the trope of foreign markets altogether.
What are your thoughts? Is this an idea that resonates with your experience? Do you see the world through a different lens altogether? Let’s start a dialog!
The following “Compton Classic” article was published in the September 2017 issue of MultiLingual magazine and is being re-published here for posterity. Minor edits may have been made to support relevance and clarity.
For those of us who are responsible for content globalization programs, it has become increasingly important to manage all content comprehensively, regardless of the language it’s in, who creates it or in what forms it’s packaged (text-based documents, audiovisual media or something else entirely).
I posit this: audiovisual content is not actually a different type of content. It represents an expression of your core message that just happens to have audiovisual characteristics. Why does the distinction matter? Because including such media as part of your program means being able to take advantage of today’s rich and dynamic communication landscape, as opposed to being taken advantage of by it.
“Multimedia is too expensive”
In November 2013, I had the privilege of attending the Adobe Experience Manager Multilingual Content Special Interest Group meeting at the Adobe headquarters in San Jose. This was a forum where Adobe product managers, content managers from large global enterprises and folks from the localization industry got together to discuss multilingual content management in a casual, collaborative setting. The focus of this session was digital asset management (DAM); specifically, “how DAM can be used for multilingual purposes.”
One conversation struck me. A representative from Adobe asked the enterprise content managers in the room what seemed like a perfectly innocuous question, the gist being, “how can we make the creation and management of multilingual audio and video digital assets easier for you?”
The collective answer as I recall was a resounding, “We don’t care. Multimedia isn’t included as part of our content globalization program.”
I was stunned. I had to ask the room, “Why not?” after which someone offered me a straightforward answer that was essentially “Because it’s too damn expensive.”
“Oh, OK,” I remember saying, then retreating into a moment of self-reflection. I believed the answer intuitively, but I wasn’t happy about it. I had spent my professional career removing the language barrier for organizations trying to deploy their message, their products and their value worldwide.
Audiovisual media is everywhere. Why should it be excluded from the language-barrier-destroying efforts of localization? And what are the implications of doing so in the context of audiovisual media’s growing importance in an ever-flattening world?
Audiovisual media and the content globalization crisis
Fast-forward to today, almost four years later, and I assert that the world finds itself in the middle of a content globalization crisis that’s been made worse by the legacy of excluding audiovisual content.
What is this crisis?
Traditional content globalization practices are insufficient to address today’s challenges. This has created a divide that threatens every organization’s ability to reach its full global potential.
The divide is between an organization’s need to remove language and cultural barriers, and its ability to do so. In a world where global scale can propel a company well ahead of the competition, the ability to bridge this divide can mean the difference between winning and losing, even if your solution or product is the best.
Let’s color this in. Say you’ve got a brilliant new solution or product that’s destined to be deployed worldwide — for example, a cure for the common cold.
Were it 20 years ago, you might have only a handful of communication channels to manage in order to successfully get your message out to the world — a website, some print marketing, maybe a public relations function. And for the most part, these channels were predictable and controllable: you create appropriate content for each channel in a source language and then deploy it per a schedule of your design. The local market would then consume it, after which you’d measure the impact and then refine the messaging if necessary.
Sometime further down the road comes localization, another controlled process in which that source content would be adapted for foreign markets, resulting in translated and adapted target content. After deploying this content, you might measure its impact in each market, but you might not be able to do anything about the results if you weren’t set up to process multilingual and multicultural market feedback. The campaign wasn’t well received in Indonesia? That’s too bad; I guess we won’t be selling much there.
Occasionally, multimedia content might be included, say if there were training material involved, but it wouldn’t be typical, and would represent a special type of project with longer durations and higher costs (sometimes even “too damn expensive”).
Social media circa 1995.
The speed of now
The world has changed in the last 20 years and the old rules no longer apply.
Crowd models. Cloud computing. Hyper-personalization. Cheap, always-on mobile internet. Smartphones. Social media. More than 20 years of evolving technologies have combined like strands of DNA to create new, multiplied capabilities. These technological advancements have both empowered us and shaped our expectations. And for the current generation of decision-making adults, this is the way their world has always been.
Managing these communication channels has become more complex, starting with the sheer number of channels you need to manage: in addition to your website, print marketing and formal public relations activities, you now need to handle search across a plurality of platforms, mobile apps, a half dozen social networks, community-driven user forums, and various marketplaces. The lines that divide these channels have become blurry (such as YouTube-hosted webinars that are promoted and discussed on LinkedIn, Twitter and Facebook).
And new channels can appear seemingly out of nowhere, quickly gain critical mass and then just as quickly disappear into obscurity.
The basic nature of these new classes of channels is very different from the classic channels. They’re omnidirectional. Not only do they represent communication from you to the market, but also communication from the market to you, and discussions about you (a universal need that gave birth to the formal practice of sentiment analysis).
They’re also immediate, operating at the speed of direct, real-time, human-to-human communication — or faster.
Also, like human-to-human communication, these new channels are rich in context and come with a layer of metadata that captures information about who is communicating what, where and why.
For marketing managers, product managers and those responsible for managing communication, maintaining a level of control over these channels, even in a single language, represents a significant challenge compared to 20 years ago. It’s a universal problem that has given rise to new classes of tools and technologies that address content management, marketing automation and sentiment analysis.
Finally, multiply this complexity by the fact that not all channels are equally favored around the world, that they operate in multiple languages, and that traditional localization processes were never designed to address their high-volume, immediate and omnidirectional nature.
How can you possibly manage so many channels now that you’ve even further multiplied the challenge by including dozens of extra languages? Now you have the makings of a proper crisis.
Deepening the crisis
And then we bring in audiovisual content.
Both localization technology and content management technology have largely evolved under the assumption that content equals text. Multimedia content has been with us for a while, but until recently was classified by many as an outlier.
Not anymore. A content globalization strategy must account for audiovisual content. Exhibit A: YouTube.
As of March 2017, more than one billion hours of content are consumed on YouTube every single day. And before you think that it’s all about adorable cats, keep in mind that YouTube has increasingly become the “go-to” place for people looking for product support or making buying decisions. It represents a significant channel of communication about your brand, whether you actively participate in the conversation or not (overlooking YouTube makes a huge brand statement in and of itself).
It is the quintessential modern channel: omnidirectional communication, instant, massive volumes, open participation and metadata richness. But it’s a new and unique type of media: the core message is encapsulated as a combination of moving pixels and audio waveforms, not the text that traditional processes and tools were designed to deal with.
Why am I raising this red flag? Because as more and more communication gravitates to audiovisual channels like YouTube and the world continues to flatten, unaddressed language barriers make this trend a liability to you, and could be what stands between you and realizing your full global potential.
A typical subscriber enjoying the September 2027 issue of MultiLingual magazine.
Time to think differently about global content
It’s a false idea that a single product or one simple thing that you can do differently will neutralize the challenge. The right practices and the right application of technology are crucial, but making meaningful progress needs to start with our worldview, beginning with how we think about global content.
Let’s begin with some core assertions that are at the heart of any advice I may have for you about content globalization.
There is no such thing as a “foreign market.” The concept of foreign markets is a pre-globalization holdover from an era when businesses were born and raised within the walls of a single nation-state, only later to expand abroad as a logical next step for growth.
Think about the McDonald’s restaurant enterprise, which started in California in the 1940s and was an exclusively American phenomenon for 20 years. It wasn’t until 1967 that McDonald’s opened its first international franchises in British Columbia and Costa Rica, but when it did, it opened the floodgates for massive growth.
There is nothing inherently wrong with the “start locally/expand globally” approach, but in today’s landscape it’s artificially limiting. There are very few arguments in favor of waiting for a future stage to become a global company, and plenty of arguments against.
First, some business models don’t work unless they operate at a fully global scale. Can you imagine how much less useful businesses like Airbnb and Uber would be if they only worked in one country? Or what if your credit card or smartphone didn’t work practically everywhere in the world?
Second, if you wait to do business across geographic borders, you’ve given a head start to in-country competitors that already have home-field advantage. You’re also giving ground to global competitors that can wield a significant competitive advantage over you if their operational reach already transcends borders.
So, look at your global content strategy. Is it built on a start-local-then-expand-global business model? Are you creating your messaging with a “home market” in mind, and then adapting it for “foreign markets”? If so, your content strategy is probably a liability to fulfilling your global potential.
Don’t go global. Start global in everything: your product design, your business model, and your content strategy.
Your content does not equal its form. If you ask customer support managers what content is, they’ll likely describe help systems, manuals, white papers, press releases and websites. If you ask marketing managers, they’ll add blog posts, product descriptions, tweets, newsletters and more.
Ask other types of professionals and your list will grow. You might be tempted to conclude that content is a collection of a lot of different things. But I would assert that it isn’t, and that thinking about content in this way is in fact problematic.
The various forms in which we interact with content represent renderings — permutations of content in various forms that have been optimized for the channels and contexts in which it gets used. Content in its raw state — decoupled from its many forms — can be more closely described as an idea or a message.
Thinking about content as being separate from form unlocks natural efficiencies and promotes flexibility and scale.
This isn’t a novel idea; it’s a concept that’s already proven itself, being one of the driving principles behind the development of CSS (Cascading Style Sheets) and XML (eXtensible Markup Language). The theory goes that by separating content from presentation and delivery, each of these facets can be modified independently. Rewriting a paragraph of text, for example, doesn’t require reworking the layout. Also, content can be written once and reused in many different places.
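Here is a minimal Python sketch of that principle in action (the content and renderer functions are invented for illustration): the message is authored once, and each presentation is just another renderer applied to the same source.

```python
import json
from html import escape

# The single-sourced "content": written once, independent of any presentation.
message = {
    "title": "Sound and Vision",
    "body": "Your content does not equal its form.",
}


def render_html(content: dict) -> str:
    """One presentation layer; changing the layout never touches the content."""
    return (f"<article><h1>{escape(content['title'])}</h1>"
            f"<p>{escape(content['body'])}</p></article>")


def render_plain_text(content: dict) -> str:
    """A second presentation of the same source content."""
    return f"{content['title'].upper()}\n\n{content['body']}"


def render_json(content: dict) -> str:
    """A machine-readable rendering for downstream channels."""
    return json.dumps(content, ensure_ascii=False)


# Rewrite the paragraph once and every channel picks up the change.
for renderer in (render_html, render_plain_text, render_json):
    print(renderer(message))
```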
It was this vision that fueled the single-source publishing/component content management movement that rescued the world from a similar crisis 15+ years ago. It’s hard to imagine, but prior to that, if your messaging needed to change, that change would need to be manually implemented in every permutation of your content: documentation, website, help systems, press releases, advertisements and so on.
If you had localized versions of that content, those costs (including time, money, and resources) would be multiplied by however many different languages needed to be maintained.
As you might expect, all of this had a chilling effect on change, and inhibited the localization of messaging that would otherwise have been well received by a global audience. All because of cost. “Too damn expensive.”
Fortunately, the principle of separating content from form was built into content production and localization practices, and global content was flowing again, fluidly and affordably.
How is this relevant to the audiovisual content globalization crisis? Because audiovisual content was largely left out of the single-source publishing/component content management revolution, and it’s still handled like an outlier.
Today’s customer support managers and marketing managers will certainly include things that can be described as “multimedia” in their descriptions of content — videos, infographics, podcasts, webinars and so on, but their systems and practices tell a different story. Audiovisual content is treated differently from traditional content. It’s why DAM and CMS (content management systems) are separate acronyms representing separate paradigms, and often separate ecosystems and practices.
There are very few examples of audiovisual content that has been designed for reuse across channels and optimized to be useable worldwide. Why is this?
A lot can be explained by our legacy thinking that assumes that content in its raw state (the idea) still equals text. The basic building block of content management systems that strive to decouple content from form is still the topic, which is still a string of text in a single language. This is a model we’ve outgrown.
If we’re going to bridge the content globalization divide for audiovisual content, then we need to rethink that ontology: text and language are just characteristics, or dimensions of content’s form. Its raw state is still the idea or the message.
This is important because just as the rise of single-source publishing had a liberating effect on the globalization of text-based content, applying its core principle of decoupling content from form can provide the same benefits to audiovisual content too.
Your content is a potentially rich data asset. If you believe that information is power and power is wealth, then your content is a gold mine waiting to be excavated.
We previously asserted that today’s modern channels come with a layer (or metachannel) of data that tells you who is communicating, when and where. What kind of person they are. How they reacted to the content. How they found the content in the first place.
Audiovisual content is, by nature, especially rich in contextual data about the message itself. You can tell quickly whether it’s humorous, academic, or the rantings of a crazy person. You can usually extract a lot of context from just a few seconds of audiovisual data because it resembles the kind of input our brains are built to decode incredibly quickly.
This universe of content metadata exists whether we choose to make good use of it or not. So much data is just thrown away as a byproduct of processes. But if we choose to think like analysts and deliberately capture and analyze it, we can start to do some interesting things.
For example, we can unleash hidden correlations between content characteristics and its performance out in the global market, giving us actionable insights into what we should be doing differently or what we should do more of.
The holy grail of this type of intelligence is of course predictive power — the ability to anticipate what outcome your actions are likely to produce. The campaign isn’t expected to be well received in Indonesia? What if we change this?
(Incidentally, from an information science perspective, organizing content within an ontology that considers language and form to be separate data dimensions makes these kinds of analyses easier).
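As a hedged illustration of the kind of analysis I mean, here is a small Python sketch over fabricated records. It does nothing more sophisticated than check which content characteristics move together with a performance metric, but that is the seed of the “hidden correlations” idea.

```python
from statistics import correlation  # Python 3.10+

# Fabricated records: each row is one piece of content, with a few
# characteristics (duration, whether it was subtitled, language coverage)
# and one performance signal (engagement). A real program would pull these
# from its content and analytics systems.
records = [
    {"duration_sec": 45,  "subtitled": 1, "languages": 12, "engagement": 0.71},
    {"duration_sec": 180, "subtitled": 0, "languages": 1,  "engagement": 0.22},
    {"duration_sec": 60,  "subtitled": 1, "languages": 8,  "engagement": 0.64},
    {"duration_sec": 300, "subtitled": 0, "languages": 2,  "engagement": 0.18},
    {"duration_sec": 90,  "subtitled": 1, "languages": 15, "engagement": 0.80},
]

engagement = [r["engagement"] for r in records]
for feature in ("duration_sec", "subtitled", "languages"):
    values = [r[feature] for r in records]
    print(f"{feature:>12} vs engagement: r = {correlation(values, engagement):+.2f}")
```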
For audiovisual channels, the application of such business intelligence to a global content program can mean a re-empowerment of production and localization budgets, since money need only be spent where it’s predicted to pay off.
From vision to action
Used to drive a comprehensive content globalization solution, the above principles reinforce each other and help evolve operational capability in a landscape that desperately needs it.
Today, managing worldwide communication across all channels calls for an integrated approach. Remember that there is still only one global market, even if it uses different communication channels and makes use of different languages. Your solution needs to be able to join the market wherever it hangs out and communicate in all its languages, in both text and non-text forms.
Since new channels are immediate, if you wait to address culture, language and format barriers until later, you risk being left out of the conversation altogether. Your solution must address the language barrier at the speed at which communication takes place in the channel.
Audiovisual content has become increasingly important as a channel, yet it has been excluded from advancements in content globalization. Your solution should neither exclude audiovisual content nor try to handle it “off the grid” from the rest of your global content.
Fortunately, the content globalization crisis represents an opportunity, not only for audiovisual media but for all content. If our practices and systems deliberately capture and make good use of content-oriented data, we have all the makings of a comprehensive solution. One that is suited for today’s complex, global, multi-channel, fast-moving landscape.
And it’s our prerogative to seize it. Borrowing from Google’s mission statement, there’s a big difference between “all the world’s content universally accessible” and “some of the world’s content, mostly in text form, available mostly in English.”
The following “Compton Classic” article was published May 2017 on the now defunct Moravia Blog and is being re-published here for posterity. Minor edits may have been made to support relevance and clarity.
Digital files have served us well, having helped bring us the digital age. But now that we’re older and wiser, do we still need them? And should they be allowed in our content programs? In this post I propose that the digital file—with its inherent limitations—may now be more burden than boon.
It all starts with metaphors
Metaphor is a constant force in the history of innovation, and is a powerful thing. Not only can it trigger innovation (Where else can this technology be applied?), but it profoundly influences the design process (Where has this problem been solved already?).
Metaphors serve as gateway concepts that enable us to leave the old world for the new without suffering too much cognitive dissonance. When something is too foreign for our brains to comprehend, metaphors anchor the unknown to that which is already known, allowing us to focus on classifying the novelty in a way that is generally consistent with our world view.
“Is this something that I can eat, or is it something that will try to eat me?”
It’s the concept behind the design technique of skeuomorphism, which attempts to make software look and feel like its real-world counterparts so that it feels more natural. You may have used e-readers with “page curl” effects or digital watches that have hands. Or, if you’ve made electronic music, used virtual instruments with chrome-plated rotary knobs and wood grain.
These are all applied metaphors. They have a purpose, a time, and a place. But as the unfamiliar becomes mundane and their comforting quality becomes unnecessary, they also have a shelf life.
The rise of physical office metaphors
Electronic computing is by all measures a revolutionary concept, so it isn’t surprising that, as computers were sold and marketed to a wider audience, the process leveraged lots of metaphors, especially physical office metaphors, including the desktop, mail, folders, trash cans/recycle bins, and of course the ubiquitous document, otherwise known as the file.
In the physical world, files are quite useful. They represent a clearly scoped, unambiguous container for content and information. They can be opened any number of times, duplicated, stored, marked up, moved around, organized, shared with others, and if necessary, destroyed.
These are all things that we’d like to do with content in the electronic world too. And since digital files can be processed far faster and at greater volumes than what is possible in the physical world, what’s not to love, right?
The computer scientist Edsger Dijkstra warned that when we are confronted with a “radical novelty,” our habit of reasoning by analogy stops serving us:
“…though we may glorify it with the name ‘common sense’, our past experience is no longer relevant, the analogies become too shallow, and the metaphors become more misleading than illuminating.”
He argues that analogies not only project characteristics upon the “radical novelty” that aren’t true, but they can deny the thing its true potential.
Looking at the world through Dijkstra’s lens, it is easy to find examples of metaphors that have broken down, imposed unnecessary limitations in the space they’re supposed to empower, or turned the tables by requiring the user to serve them.
Look no further than your smartphone—which opted out of many of the office metaphors previously taken for granted. Would your phone be more useful if it had a virtual desktop and trash can? Would it save you time if you could put virtual pieces of paper into virtual stacks and organize them into virtual folders?
Files’ strengths become their weaknesses
Digital files, as an applied metaphor used to manage information, impose unnecessary limits on information management. In the context of new information technologies, those characteristics which were strengths in the physical world become weaknesses in the digital world.
The “clearly scoped unambiguous container” problem
The information that lives in a file is static in a world where the demand for information is dynamic.
If I have a report, for example, and I want additional information that isn’t in the report, a new report needs to be generated. And since the file is a snapshot of the data, by the time it makes it to my eyes it’s already old news.
By comparison, systems that forgo the file container and use more database-like techniques can serve us by creating on-demand renderings of the real-time data based on whatever criteria we’re interested in.
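Here is a small Python sketch of the difference, using an in-memory SQLite database as a stand-in for the live system of record (the schema is invented): the static report is stale the moment new data arrives, while the on-demand query always reflects current reality.

```python
import sqlite3

# Stand-in for the live system of record (table and columns invented).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (region TEXT, amount REAL)")
db.executemany("INSERT INTO orders VALUES (?, ?)",
               [("EMEA", 120.0), ("APAC", 80.0), ("EMEA", 45.5)])

# The "file" approach: a snapshot rendered once, already aging as it's written.
static_report = db.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region").fetchall()

# New data arrives after the report was generated...
db.execute("INSERT INTO orders VALUES ('APAC', 200.0)")

# ...so the snapshot and an on-demand query now answer the same question differently.
live_report = db.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region").fetchall()

print("static file snapshot:", static_report)  # misses the late order
print("on-demand rendering: ", live_report)    # reflects current reality
```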
The portability problem
In a world where systems can talk to each other directly, the flexible portability of files invites trouble. If data must go on a road-trip—riding in a file—it risks being subject to corruption, obsolescence, hijacking, and getting lost along the way.
If it’s a human who is playing host to the file, they must make a home for it so that they can find it later (which is also known as the “thing that needs to be physically put away into a folder” problem).
In the context of sophisticated search algorithms that can find and retrieve information regardless of where it lives, this requirement to organize files—inherited from the metaphor of physical files—has become an example of “the tail wagging the dog.”
Process waste
In certain business-to-business traditions, localization included, files have been used as an interchange format, a way to exchange information between otherwise independent systems or independent steps within a process.
But from the standpoint of overall process efficiency, using files in this way represents several classes of muda (waste, inefficiency), including entire steps dedicated to:
changing the file format
running quality control to make sure something didn’t break when the file format was changed
hand-offs and hand-backs, version control, file-splitting, etc.
The non-value-add costs add up quickly.
Alternatives to the file
In the world of building globalization solutions, the practice of processing content through files has been universal. And measurably costly. But things are changing.
When designing solutions today, if there is a file in the mix, we’re asking, “Is this necessary? How much will it cost to create and manage this file?”
Instead of bringing the content to the process (in the form of files), we’re looking at ways to bring the process to the content using web services, browser extensions, and other techniques that don’t force the content to go on long, expensive, and unnecessary road trips.
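In practice, “bringing the process to the content” might look something like the following Python sketch: a connector reads localizable strings from the content system’s API, runs them through a translation step, and writes the results back in place. Every endpoint, field, and function here is hypothetical; the point is that no file ever leaves home.

```python
import requests

CMS_API = "https://cms.example.com/api"          # hypothetical endpoint
API_TOKEN = "replace-with-a-real-token"          # hypothetical credential


def translate(text: str, target_language: str) -> str:
    # Placeholder for whatever translation step the solution actually uses
    # (MT, TM leverage, a human workflow, or a mix).
    return f"[{target_language}] {text}"


def localize_in_place(content_id: str, target_language: str) -> None:
    """Pull source strings, translate them, and push translations back,
    without the content ever leaving its home system as a file."""
    headers = {"Authorization": f"Bearer {API_TOKEN}"}

    # Bring the process to the content: read the strings over the API...
    source = requests.get(f"{CMS_API}/content/{content_id}/strings",
                          headers=headers, timeout=30).json()

    # ...translate each segment...
    translations = {key: translate(text, target_language)
                    for key, text in source.items()}

    # ...and write the results straight back to the same records.
    requests.put(f"{CMS_API}/content/{content_id}/strings/{target_language}",
                 json=translations, headers=headers, timeout=30).raise_for_status()
```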
And this seems to be a trend.
The XLIFF-OMOS project is an attempt to make the XLIFF format (which aims to standardize the way localizable data is handled by localization systems) useful in a world without files.
“A top priority [of the project is to] unleash XLIFF 2, make it usable outside XML pipelines, and ready it for real time cloud services”
XLIFF-OMOS, you’ve got my attention!
Your Opinion?
What do you think? Are files a boon or a burden for your program? Something else entirely?
The following “Compton Classic” article was published November / December 2018 in MultiLingual magazine and is being re-published here for posterity. Minor edits may have been made to support relevance and clarity.
At the 2018 Globalization and Localization Association (GALA) conference in Boston, I was part of the “Rise of Big Data in Localization” panel. It was an interactive session intended to spark discussions about how big data, machine learning and AI trends are shaping the globalization and localization industry.
It’s a broad subject that I encounter a lot at industry conferences, often in the form of the “Will AI replace humans?” debate. Our GALA panel wanted to take a slightly different approach. We wanted to ask the audience some provocative, open-ended, yet ultimately personal questions about their take on big data and AI.
Those interested in the subject tend to fall into a few different categories. There are those who are suspicious and fearful of AI; those who directly develop the technology; and those who are optimistic about its potential and keen to maximize its practical use, even if they’re not entirely sure what that looks like. They’re seeking those “killer applications” of AI that are both relevant to their business and obtainable.
As part of the third group, I have an interest in learning as much as I can from the practical experience of others. And I believe that there’s a lot that the globalization and localization industry can gain by sharing ideas, challenges and experiences. The question is: how can we use AI to solve practical problems and improve how we work?
Here I’ll pose to you the same questions that I asked the folks at GALA Boston, as well as share some of my own ideas. My hope is that we can routinely have these kinds of discussions about AI, and that through such “shop talk” we can standardize its use and maximize its benefit throughout our industry.
Incidentally, making good use of AI has always been part of the globalization and localization industry’s culture. Even excluding machine translation, which has been used on projects since the 1960s, we’ve been using translation memory, concordance searching, terminology recognition, optical character recognition (OCR), quality assurance checking and other applications of AI for some time.
Continuing to take positive advantage of AI technology is our prerogative.
Provocative questions
I posed two questions to the GALA audience: one that I wrote myself, and the other I stole from a vintage IBM advertisement for a punch card calculator.
The first was, “What are the killer applications of artificial intelligence for us?” (with “us” being deliberately open to interpretation). The second was, “What would you do with 150 extra engineers?”
1951 ad for the IBM Electronic Calculating Punch. What would you do with 150 engineers?
I find the latter question to be timely, both in general and in the context of the globalization and localization industry — especially if you expand the question to include translators, project managers, copywriters, solution architects, account managers and other roles that depend on finite supplies of human expertise. You could alternatively ask, “What would you do if human resource limitations didn’t exist?”
I wasn’t around in 1951, but I can imagine reading this ad (Figure 1) and being inspired to brainstorm about the future. I like this type of “what if” question because it forces you to decouple the task of deciding what you’d like to accomplish from the task of figuring out how to make it happen. The latter shouldn’t overly influence the former. This is especially true when talking about applying computer technology to problem solving — it can execute in microseconds what it takes humans minutes to do. Through the application of technology, ideas can and routinely do go from fantasy to reality.
In 1951, IBM delivered pure computational firepower with their room-sized punch-card machines. Today, artificial intelligence offers more complex and nuanced capabilities that go beyond number crunching. With AI, we can reasonably expect that not only engineers, but all globalization and localization roles could potentially experience a 150-fold growth in efficiency.
So, 67 years later, it’s time for us to ask ourselves this question again.
The globalization and localization industry has a profoundly important mandate: to remove language barriers that impede world progress. We’ve made impressive strides since 1951, but with the continued shrinking of the world, the rise of digital communication and other global megatrends, it’s become a harder problem to solve.
Can advances in AI capability help us solve these big challenges? Obviously, machine translation (MT) will play a huge role, but what else?
Someone in the audience asked me, “What do you think the killer applications for AI in our industry are?” So, I’ll share my thoughts here.
What would Jim do with 150 extra engineers?
Don’t take the question too literally; I see it as a metaphor for the automation of complex human tasks.
I start my brainstorming by drawing inspiration from my own professional experience — especially pain suffered, or instances where I’ve been face-to-face with situations that seemed hard to resolve. When do I remember recognizing something wasteful, or having to accept a compromise on a solution because of some human-based limitation?
Here are a few real-world examples of painful experiences from my last twenty-plus years at various globalization and localization companies:
1. Valuable transactional data lives in a set of folders that gets archived after the completion of a project, never to be seen again. When similar projects are executed later, the same lessons have to be learned again and again.
2. A client gets frustrated because a price estimate that was quoted to them for a project seems inconsistent with the price that was previously quoted for a similar effort. Looking at the details of the quote, it’s discovered that significant parts of the effort have been estimated with a “finger in the wind” approach.
3. An account is operating smoothly, and the client is happy. In large part this is thanks to the efforts of a long-time veteran of the projects who has a nuanced understanding of how they should operate. But when that person quits the company, errors start popping up and the client stops being happy.
4. A large team is assembled in preparation for an expected upcoming big project, but the client changes its plans and these resources aren’t needed after all. So, they’re either redeployed elsewhere or let go. Three months later, the client again changes its plans and the project is on again, this time as an urgent priority.
5. An opportunity presents itself as a unique client challenge, and a pursuit team of solutions architects and subject matter experts convene and develop the ideal solution over a period of weeks. Later, through a casual conversation, it’s discovered that another pursuit team had developed a different solution for a very similar opportunity a few months prior. Some of the elements of that solution are actually better.
You may have had similar experiences. What I see them having in common is that they could all benefit from applied intelligence — from the ability to better recognize a pattern, understand cause and effect, predict a future quantity, or identify something — but where “throwing experts” at the situation would be expensive and would likely create its own problems.
How might AI be able to help with these situations?
“Alexa, play Animal Game”
My kids love to play a game on our Amazon Echo called “Animal Game.” Alexa has you think about a specific animal, asks you a series of mostly yes or no questions about it, and then guesses which one you’re thinking about. It goes like this: “Does it eat leaves? Can it climb trees? Does it live underwater? Is your animal a giraffe?”
It’s a version of the old Twenty Questions parlor game that any two people can play. Computer-based variants of the game exist in the form of dedicated websites such as 20Q and Akinator, or in highly specialized “profiling” quizzes on social media that can tell you what Star Wars character or 80s metal band you most resemble.
I find this approach — asking a series of quantifiable questions in order to identify something mysterious by name — to have real-world relevance. If you know the name of something, you are better empowered to deal with it. For example, by identifying that the spider you found in your garage was Eratigena atrica, you know that it isn’t dangerous. By knowing that your itchy eyes are the result of allergic conjunctivitis, you can take an antihistamine and skip the topical antibiotics. In combination with access to collected information about it, knowing something’s name is quite powerful.
“Rumpelstiltskin”, by Henry Justice Ford, from Andrew Lang’s Fairy Tales. Rumpelstiltskin understood the value of learning the name of something mysterious.
For several of my experience examples, but especially for scenario #5, I think having some kind of Animal Game-like system in place would have helped immensely.
I can imagine an alternative timeline for that scenario, in which the pursuit team first stops and asks, “Have we seen this sort of situation before? And if so, what do we call it?” Using the system, they would have identified that the scenario was in fact not unique. It already had a name, and was associated with some existing best practices and technologies.
“Folks, it looks like we’re dealing with a giraffe here. It typically lives between 15-20 years. Every day, it only needs a half-hour of sleep, but eats up to 75 pounds of food. We’ll need to place its meals high up. We have a platform that we constructed for just this purpose.”
From the standpoint of solutions architecture, being able to differentiate between a “giraffe scenario” and a “reindeer scenario” is a prerequisite to being able to effectively serve either one, allowing us to make use of our animal-specific assets, including our know-how.
That’s a capability I’ll call opportunity profiling, and it’s one of the things that I’d do with 150 extra engineers. How would I start?
Finding an AI solution
Navigating the rich landscape of AI technology isn’t that easy if you aren’t immersed in it. Soon there will be Animal Games to help you identify the best AI solution for your use case, but in the meantime, I will share with you the process that I used when trying to create a working example of this profiling capability.
The world of AI options is about as diverse as the world of technology options. There are dedicated commercial products, commercial products that contain AI features, SaaS systems, API-based cognitive microservices, open source toolkits, and of course you can always try to home-grow a capability. The maturity, costs and level of involvement required to operationalize these options cover a wide spectrum.
There’s not just one kind of AI either. Capabilities can be subdivided both by domain (self-driving cars, video games, natural language processing, image recognition, virtual assistants, industrial process management) and by AI approach and worldview, for which there are almost as many variations as there are genres of music.
Deep learning (a branch of machine learning), for example, finds correlations between inputs and outputs through a process of layered statistical analysis. It’s capable of making predictions based on naturally existing relationships between things that would be difficult to model using traditional regression techniques, or that involve characteristics that are hidden yet statistically relevant. It represents a world of extraordinary possibility, but requires powerful computers and a framework for capturing and processing big data, possibly including dedicated data centers.
Not long ago, deep learning was the exclusive domain of AI researchers, but tools like Google’s TensorFlow, released as open source, have lowered the barrier to entry. In combination with leasable cloud computing platforms, an organization can set up a deep learning capability with less capital investment than ever before. Of course, there is still a barrier to entry, as organizations need the expertise to operationalize the technology — experienced AI engineers are in short supply. And they also need lots of relevant data. Companies that manage their data like an asset, and have a great deal of it, have a competitive advantage over those that don’t.
For my opportunity profiling problem, deep learning would be overkill. Instead, I worked with my developer colleague Alexander Sádovský to research how other Animal Game-type systems worked, and we landed on the technique of the “decision tree classifier.”
Using the general approach, Alex set up a working example in just a couple of hours.
Here’s how it works in a nutshell: you have a set of data that includes a certain number of “classes” (discrete things that we have names for, such as giraffes and reindeer), and a list of core characteristics about each of those classes that collectively make them unique (is an herbivore, has antlers, has a long neck, is found in Africa and so on). There should be at least enough characteristics such that no two classes share the exact same set.
The point of asking the questions is essentially to narrow down the possible classes. Each question should reduce the current set of possibilities by roughly one-half — a process akin to binary search.
Because of the power of exponents, it doesn’t take that many questions to reduce the possibilities down to a single item from a huge set. Through twenty questions, you can classify a single candidate from a set of over a million possibilities (2^20). If you include non-yes-or-no questions, you can address an even bigger set.
With opportunity profiling, there aren’t a million different classes. There are maybe twenty different archetype situations that we run into, meaning our system should be able to get there in five questions.
(If you search for “decision tree” you can also find tutorials online.)
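For the curious, here is a minimal sketch of the technique using scikit-learn’s DecisionTreeClassifier on an invented animal dataset. The opportunity-profiling version would swap in opportunity characteristics and archetype names; this is an illustration, not the system Alex built.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Invented training data: each row answers a few yes/no questions (1 = yes).
features = ["eats leaves", "climbs trees", "lives underwater", "has long neck"]
animals = {
    "giraffe":  [1, 0, 0, 1],
    "monkey":   [1, 1, 0, 0],
    "dolphin":  [0, 0, 1, 0],
    "reindeer": [1, 0, 0, 0],
}

X = list(animals.values())
y = list(animals.keys())

# The classifier learns which questions split the classes apart most efficiently.
tree = DecisionTreeClassifier().fit(X, y)

# Print the learned question sequence, Animal Game style.
print(export_text(tree, feature_names=features))

# Classify a new, unnamed observation: eats leaves, doesn't climb, not aquatic, long neck.
print(tree.predict([[1, 0, 0, 1]]))  # -> ['giraffe']
```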
Now what?
My plan now is to run a limited pilot with our solution architects group to determine if this approach is generally useful and worth developing further. We’re going to first build a data set using a record of historical projects and their characteristics, then see how well a junior architect (or someone with no expertise) using the system can identify the class of an existing opportunity compared to an expert.
If that works, we can point the system to our existing repositories of solutions intelligence to ascertain how best to serve these opportunity profiles: what the ideal team looks like, what’s an ideal configuration of tools and processes, and what risks and pitfalls the operations team can expect.
I will want to know: What effect will this process have on our key performance indicators created during the solution design process? Can we bring better solutions to the table more quickly and cheaply? How accurate are our predictions against their operational reality?
If the process brings measurable value, I’ll want to keep improving the algorithm. In the world of business problems, opportunities are constantly evolving. New classes of opportunity are born, and existing classes evolve. I can envision the system being improved to include an interface that allows users to add new classes of characteristics, and to bring some supervised machine learning into the mix, allowing experts to provide feedback on the quality of the results.
Conclusions
The globalization and localization industry finds itself in a fascinating place right now. We’re at a confluence of technological trends, including advances in artificial intelligence, that collectively open up a world of possibility. At the same time, as the world becomes more digitally connected and communicative, the problem space has grown more demanding.
Like other technological trends, advancements in AI will exert different kinds of disruptive influence on many industries, including ours.
One effect will be increased access to AI technology. We’re already seeing this now in the trend of AI companies like Google, Microsoft, Amazon and IBM competing to bring cloud-based cognitive services platforms to businesses at the cost of less than a tenth of a cent per transaction. A phenomenon like this should have a “playing-field leveling” effect on industries.
At the same time, AI will magnify the differences in capabilities between organizations. The most advanced AI techniques thrive on having lots of data, and those that deliberately collect and manage data like a business asset will have a significant advantage over those that don’t.
The ability to operationalize AI will be another differentiating factor, and this is where I believe that the globalization and localization industry has a reason to get excited. Many of the decades-old practices we take for granted today are examples of applied AI; we can rightfully say that we’ve been using AI “before it was cool.” Translation memory and machine translation are two prime examples of AI in action in our industry. So now that AI is becoming more powerful and accessible, what else can we do with it?
I like approaching this question from different angles. Wearing my “big blue sky thinking hat,” I can imagine what the world would look like if we weren’t at all bound by natural laws. I find tremendous value in that exercise, but I also find value in a more pragmatic approach, looking at our existing world through a lens of process improvement and incremental innovation.
Once we’ve generated some ideas, it is important to get our hands dirty with AI and start trying out solutions. It may get messy, but that’s okay. Innovation is an imperfect and iterative endeavor, but the potential gains from applying AI to our industry’s real-world challenges are so great that we must engage.
The following “Compton Classic” article was published in 2018 on the now defunct Moravia Blog and is being re-published here for posterity. Minor edits may have been made to support relevance and clarity.
The term “composability” has largely remained exclusive to the IT domain, but it’s a concept that has broad relevance, including to the GILT (globalization, internationalization, localization, and translation) industries.
Compos-a-what-a-bee?
If you don’t have at least one foot in IT, you may never have heard the term “composability,” which Wikipedia describes as a system design principle. I’ve mostly seen it used in the context of designing IT infrastructures.
Its key principles are modularity and reusability, but the heart of the concept can be found in the word itself. To compose something is to give it form through the assembly of components, and the extent to which something supports the creation of new things through these components is its level of composability.
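To make that concrete in code terms, here’s a toy sketch; the pipeline steps and function names are invented purely for illustration:

```python
# A toy illustration of composability: small, single-purpose components
# assembled into a larger capability. All names here are hypothetical.
def extract_text(document: str) -> str:
    """Pull the translatable text out of a raw document."""
    return document.strip()

def translate(text: str, target_lang: str) -> str:
    """Stand-in for a translation step (here it just tags the text)."""
    return f"[{target_lang}] {text}"

def format_output(text: str) -> str:
    """Re-apply formatting to the translated text."""
    return f"<p>{text}</p>"

def compose(*steps):
    """Chain components into a new, reusable component."""
    def pipeline(value):
        for step in steps:
            value = step(value)
        return value
    return pipeline

# The same pieces can be re-assembled into different pipelines.
to_german_html = compose(extract_text, lambda t: translate(t, "de"), format_output)
print(to_german_html("  Hello, world!  "))  # -> <p>[de] Hello, world!</p>
```

The point is that each component stays small and single-purpose; new capabilities come from how you assemble the pieces, not from any one piece.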
When you start to look at the world through this lens of creation through the assembly of components, you see examples of composable systems all over the place.
The go-to analogy out there is LEGO, and it’s easy to understand why. Anyone who has ever played with Legos has a good sense of the concept already. Through a set of different-but-interoperable pieces, you can create any permutation of object—or scenario—you can imagine.
It’s an idea explored in the movie “Big Hero 6” through the plot device of the Microbots, a set of “neurotransmitter controlled,” magnetically connectable pieces that can create dynamic, architecturally complex structures (which ultimately get stolen by the villain and used as his superpower).
Composability is also central to the IKEA business model, which is built on selling modular pieces of furniture that can be bought and combined to form comprehensive groupings.
And it’s an underlying principle (although not by name) in Adam Smith’s The Wealth of Nations, which asserts that economic growth is driven by increased division of labor in both manufacturing systems and economic systems. At the manufacturing level, the components are the tasks required to produce something. Within economic systems, the components are the individual specialized industries that can be traded and combined to create new value. Our modern economy is a living example of this principle in action.
The (r)evolution
Not all systems are equally composable, however. In the world of IT, the opposite of a composable system is described as a monolith.
If monolithic systems can even be described as having components, their components are bound together. Nothing takes place outside the monolith; everything is built-in. They’re big, standalone, all-in-one, indivisible entities. There are arguments in defense of monoliths, but the case against them is that they’re inherently inflexible, slow to evolve, and difficult to scale. They’re tailor-made to a problem until the landscape changes completely—at which point their all-in-one-ness becomes an impediment. The worst examples are disparaged as “balls of mud.”
The reaction against monoliths has influenced several notable technology movements:
The Service-Oriented Architecture (SOA) movement – SOA sees the natural component within a modular system as being the service: a discrete, specific piece of specialized functionality that can serve a variety of systems and contexts, including other services. There are parallels with the Unix philosophy, which advocates small, modular, interoperable applications.
The SOA movement has strongly influenced the way companies design or re-design their technology stacks, as well as the adoption of enterprise service buses. Related, but not identical, is the microservices architecture movement, which adds its own notions about how modularity should be implemented.
The “Programmable Web” movement – I see this as the SOA movement applied inter-organizationally. The programmable web philosophy sees all the APIs available across the internet—from separate hosts—as part of a holistic programmable system from which new capabilities can be composed.
It’s a movement that’s spawned the practice of “mashups”: the creation of useful applications through the combination of APIs (for example, combining Craigslist apartment listings with Google Maps to create a visual map of available apartments).
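As a rough sketch of the mashup idea, here’s what composing two independent web APIs might look like; both endpoints are hypothetical placeholders rather than real services, and the requests library is assumed to be available:

```python
# A sketch of a mashup: compose two independent web APIs into a new capability.
# Both endpoint URLs are hypothetical placeholders, not real services.
import requests

LISTINGS_API = "https://listings.example.com/api/apartments"  # hypothetical
GEOCODING_API = "https://geocoder.example.com/api/geocode"    # hypothetical

def apartments_with_coordinates(city: str):
    """Combine an apartment-listings feed with a geocoding service."""
    listings = requests.get(LISTINGS_API, params={"city": city}).json()
    enriched = []
    for listing in listings:
        geo = requests.get(GEOCODING_API, params={"address": listing["address"]}).json()
        enriched.append({**listing, "lat": geo["lat"], "lng": geo["lng"]})
    return enriched  # ready to plot on a map
```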
“The Wealth of the Nation” by Seymour Fogel (public domain). What new value can be created by leveraging the wide world of API-accessible capabilities?
Moravia’s response
At Moravia, we’re excited about these movements, since they’re compatible with our worldview: we define a “solution” as a composition of strategy, process, technology, and human involvement, and we employ solutions to meet our customers’ needs, solve their business challenges, and provide value.
The efficacy of our solutions has always been relative to their composability—the extent to which components need to be developed from scratch, capabilities can be reused as-is, and pieces can be replaced or updated by better ones. As our customers’ workflows evolve (sometimes radically), the ability of our solutions to quickly conform to the shape of the problem (not unlike a cloud of Microbots) is incredibly valuable.
And we’ve embraced the pro-composability revolution in several ways:
We’ve been migrating to a service-oriented architecture within our own tech stack. We recently developed and deployed the Moravia Service Layer through which all our internal tools can communicate, and have been generally migrating from a products-orientation to a services-orientation. Through the Service Layer, we remove the functions that are common across tools and make them available as shared services.
For a while now we’ve been “API-ifying” our tool stack, making sure that our tools’ core capabilities can be flexibly invoked from anywhere (see the sketch after this list for a hypothetical illustration).
We’re leveraging the programmable web and API economy ourselves, utilizing third-party APIs where they can add relevant capabilities to a solution.
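As a purely hypothetical illustration of the “API-ifying” idea above, here’s what exposing a single internal capability as a small web service might look like; the framework (Flask), endpoint, and function names are my own choices for the sketch, not a description of the actual Service Layer:

```python
# A hypothetical sketch of "API-ifying" a capability: wrapping an internal
# function so any tool can invoke it over HTTP. Flask is assumed to be
# installed; the endpoint and function names are invented for illustration.
from flask import Flask, jsonify, request

app = Flask(__name__)

def count_words(text: str) -> int:
    """An example internal capability worth sharing across tools."""
    return len(text.split())

@app.route("/v1/wordcount", methods=["POST"])
def wordcount_endpoint():
    payload = request.get_json(force=True)
    return jsonify({"words": count_words(payload.get("text", ""))})

if __name__ == "__main__":
    app.run(port=8080)
```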
We’re doing these things because we see advantages for us and for our customers, both big and small. Any time we can reuse a capability that’s already available as a component—as opposed to having to create it from scratch—we save time and money.
More broadly, having a rich, highly-composable capability stack with components that we’ve either created ourselves or have leveraged from others is a liberating force for creative problem-solving. The more specialized components we have in our stack, the more we can assemble them to create new solutions to increasingly complex globalization problems.
The more we can focus on solving our customers’ big-picture globalization challenges, the more relevant and valuable we are to them. Everybody wins.
What are your thoughts on composability and its relevance to solving globalization problems? Are you making changes in the context of the composability revolution too? Let’s start a dialog!