DMM Tokyo Machine Learning Tech Talk

The Event

DMM.Make office at the Fuji Soft Akihabara Building (orange-lighted tower).

In February, I spoke at the DMM Tokyo Machine Learning Tech Talk. The audience included Japanese companies and academics who wanted to:

  1. Gain insights into how machine learning can be used in their businesses, even if they are not directly working in the computing/IT industry (e.g., fashion)
  2. Learn advanced techniques and best practices in machine learning to increase the prediction accuracy of machine-learned models

There was also a hands-on session where attendees could apply machine learning concepts and best practices with the Azure Machine Learning tool, which has a graphical user interface, making it easier to learn about machine learning using minimal programming and math.


The Venue

The event was held at DMM.Make, a subsidiary of Japanese conglomerate Digital Media Mart (DMM). The office space was located in Tokyo’s Akihabara district – Japan’s high tech cultural center and shopping destination for video games, anime, and computer goods.


DMM's fashion eCommerce website showcasing different women's dresses for rent


Event Keynote & Host


Chun Ming Chin, Keynote Speaker

Chun Ming leads a team of engineers and Microsoft Research on machine learning problems. His team's work is used in Microsoft products such as Bing. Previously, he was the Chief Technical Officer of a computer vision startup. You can read more about his story here.


Masayuki Ota, Technical Evangelist, Microsoft Co Ltd. Japan

Masayuki is with the Microsoft Japan Evangelism team, educating engineers on Microsoft products such as Azure and Kinect. He is passionate about technology, has competed in several IT hackathons and has won numerous awards. You can read more about him on his blog.

Now let's dive into what we covered at the event.


Why does machine learning (ML) matter?

We wanted to show entrepreneurs—especially those who are not working in the computing/IT industry—how to bridge ML capabilities with their business needs. ML can help businesses ranging from manufacturing to fashion and beyond. Here are some of the things ML can do:

  1. Increase the barrier for entry into your business if the quality of your product/service is dependent on the quality of the data you collect from your users.
  2. Increase personal productivity by learning and automating abstract human work without being explicitly programmed.
  3. Personalize product recommendations to increase sales and customer engagement on your website. For example, in fashion eCommerce, two shoppers may not like the same clothing design because different shoppers perceive beauty differently. Machine learning can be used to show each shopper the clothing designs that best match her personal sense of beauty.


One of DMM's managers showing a personalized jacket design made by a 3D fabric printer that uses different colored threads and computer-aided design software as input


How does ML work?

When represented in a two-dimensional plot, the essence of ML is to find a line (i.e., a decision boundary) that separates the different types of data (i.e., classes) in your training set. For example, let's say you want to predict whether an image contains Japanese or Chinese characters (i.e., 2-class classification). Each image becomes a data point in your two-dimensional plot. To specify its position, you use specific characteristics of the image, such as the number of straight lines and the number of pixels. Such characteristics are known as “features” in ML speak. Two features fit on a flat plot, but in practice you can use hundreds or even thousands of features to represent an image, which is why features are such an important concept in ML.

Next, consider a case where your images are labeled beforehand by human judges (i.e. Supervised ML) instead of unlabeled (i.e. Unsupervised ML).

There are 2 stages in machine learning:

Training Stage

  • Use data that was previously labelled by human judges (i.e. training data) to distinguish between different classes
  • With the training data, use a training algorithm (e.g. Deep Neural Networks) to train a model that represents the decision boundary line

Testing Stage

  • Use the model output from the training stage to classify new, unseen data points (i.e. testing data) that have not been labeled yet (i.e. predict the class of each data point)
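To make the two stages concrete, here is a minimal sketch in plain Python. It is not what Azure ML does internally: it swaps in a very simple training algorithm (the nearest-centroid rule) that happens to produce a straight-line decision boundary. The feature values, the 0/1 labels, and the names `train` and `predict` are all made up for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Model = Tuple[Point, float]  # (weights, bias) of the line w1*x1 + w2*x2 + b = 0


def train(points: List[Point], labels: List[int]) -> Model:
    """Training stage: learn a straight-line decision boundary from labeled data.

    Nearest-centroid rule: the boundary is the perpendicular bisector of the
    segment joining the average point of class 0 and the average of class 1.
    """
    sums = {0: [0.0, 0.0], 1: [0.0, 0.0]}
    counts = {0: 0, 1: 0}
    for (x1, x2), label in zip(points, labels):
        sums[label][0] += x1
        sums[label][1] += x2
        counts[label] += 1
    c0 = (sums[0][0] / counts[0], sums[0][1] / counts[0])
    c1 = (sums[1][0] / counts[1], sums[1][1] / counts[1])
    weights = (c1[0] - c0[0], c1[1] - c0[1])
    bias = -0.5 * ((c1[0] ** 2 + c1[1] ** 2) - (c0[0] ** 2 + c0[1] ** 2))
    return weights, bias


def predict(model: Model, point: Point) -> int:
    """Testing stage: report which side of the line an unseen point falls on."""
    (w1, w2), bias = model
    return 1 if w1 * point[0] + w2 * point[1] + bias > 0 else 0


# Hypothetical, already-scaled features per image:
# (number of straight lines, number of pixels). Values are invented.
train_points = [(0.2, 0.3), (0.3, 0.2), (0.25, 0.35), (0.1, 0.2),
                (0.7, 0.8), (0.8, 0.7), (0.75, 0.9), (0.9, 0.8)]
train_labels = [0, 0, 0, 0, 1, 1, 1, 1]  # labels supplied by human judges

model = train(train_points, train_labels)

# Unseen data points that have no label yet.
test_points = [(0.15, 0.25), (0.85, 0.75)]
predictions = [predict(model, p) for p in test_points]
print(predictions)
```

On this toy data the first test point is assigned class 0 and the second class 1. Real images need many more features and a stronger training algorithm, but the train-then-predict split stays the same.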

ML Challenges

The biggest challenge in machine learning is achieving high prediction accuracy. A typical problem is that you can train a model that classifies your training data with 100% accuracy but performs poorly on your test data. This is known as the generalization problem because the decision boundary you trained does not generalize well to your testing data. There are 2 types of generalization issues:

Under and overfitting in optical character recognition scenario


Underfitting is a scenario in which there are underlying patterns in your data that the decision boundary is unable to fit nicely. For example, your data cannot be separated using a straight line (i.e., non-linearly separable data) but the decision boundary insists on fitting a straight line. Therefore, even if you have an infinite amount of training data, the decision boundary will still fail to fit the non-linear nature (e.g., quadratic structure) in the data. Underfitting can be caused by using a small number of features to represent each data point.
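A small sketch of underfitting, using invented one-dimensional data in which class 1 sits at both ends and class 0 in the middle. A single threshold on x (the one-dimensional version of a straight-line boundary) cannot separate the classes no matter where it is placed, while the same kind of rule applied to the derived feature x^2 separates them perfectly. The helper `best_threshold_accuracy` is a hypothetical name; it simply tries every useful threshold.

```python
from typing import List


def best_threshold_accuracy(values: List[float], labels: List[int]) -> float:
    """Best training accuracy of any rule of the form
    'predict 1 when value > t' or 'predict 1 when value <= t'."""
    distinct = sorted(set(values))
    # Try one threshold below all values, one between each pair of
    # neighbouring values, and one above all values.
    thresholds = ([distinct[0] - 1.0]
                  + [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
                  + [distinct[-1] + 1.0])
    best = 0.0
    for t in thresholds:
        for above_is_one in (True, False):
            correct = sum(
                ((v > t) == above_is_one) == (y == 1)
                for v, y in zip(values, labels)
            )
            best = max(best, correct / len(labels))
    return best


# Invented data: class 1 when |x| > 1, class 0 otherwise.
xs = [-2.0, -1.5, -0.5, 0.0, 0.5, 1.5, 2.0]
ys = [1, 1, 0, 0, 0, 1, 1]

linear_acc = best_threshold_accuracy(xs, ys)                     # feature x
squared_acc = best_threshold_accuracy([x * x for x in xs], ys)   # feature x^2

print(f"best accuracy with x:   {linear_acc:.2f}")
print(f"best accuracy with x^2: {squared_acc:.2f}")
```

With the raw feature the best possible rule gets 5 of the 7 points right; with the squared feature a threshold between 0.25 and 2.25 gets all 7. More data would not help the first model, because the limitation is in the shape of the boundary, not in the amount of data.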


Overfitting is the polar opposite of the underfitting problem. With overfitting, your decision boundary is fitting detailed patterns in your data rather than capturing true underlying trends. This can be caused by using a large number of features to represent each data point.
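Here is a rough illustration of overfitting on invented data. Instead of a large feature set, it uses a model that memorizes its training data (1-nearest-neighbour), which is another common way to overfit. One training label is deliberately wrong to act as noise. The memorizing model scores 100% on the training data but stumbles on new points near the noisy example, while a plain threshold rule (hand-picked at 5.0 for this sketch) generalizes better.

```python
from typing import List


def one_nn_predict(train_x: List[float], train_y: List[int], x: float) -> int:
    """Copy the label of the closest training point (pure memorization)."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]


def threshold_predict(x: float, t: float = 5.0) -> int:
    """Simple rule capturing the underlying trend: class 1 when x > t."""
    return 1 if x > t else 0


def accuracy(predictions: List[int], labels: List[int]) -> float:
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Invented training data. The true trend is "class 1 when x > 5",
# but the point at x = 2.5 carries a wrong (noisy) label.
train_x = [1.0, 2.0, 2.5, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0]
train_y = [0, 0, 1, 0, 0, 1, 1, 1, 1]

# Invented test data with correct labels.
test_x = [2.4, 2.6, 4.5, 5.5]
test_y = [0, 0, 0, 1]

nn_train_acc = accuracy([one_nn_predict(train_x, train_y, x) for x in train_x], train_y)
nn_test_acc = accuracy([one_nn_predict(train_x, train_y, x) for x in test_x], test_y)
th_train_acc = accuracy([threshold_predict(x) for x in train_x], train_y)
th_test_acc = accuracy([threshold_predict(x) for x in test_x], test_y)

print(f"1-NN       train {nn_train_acc:.2f}  test {nn_test_acc:.2f}")
print(f"threshold  train {th_train_acc:.2f}  test {th_test_acc:.2f}")
```

The memorizing model reaches 1.00 on training data and 0.50 on test data; the threshold rule misses only the noisy training point and gets every test point right. That gap between training and test accuracy is the signature of overfitting.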

ML Solutions

I’ve listed some strategies for improving prediction accuracy below, in descending order of effectiveness (i.e., in my experience, training data improvements have the most positive impact on accuracy):

  1. Training data improvement
    • Get more training examples.
    • Ensure training data is high quality.
  2. Modify objective function
    • The target that you are optimizing your decision boundary towards (e.g., minimizing squared error in classification) is your objective function. Finding a target that is more correlated with the final metric you want to optimize can improve your results.
  3. Feature engineering
    • Increase the number of features to fix underfitting. For example:
      1. Use the product of features as derived features in training (e.g., x3 = x1 × x2).
      2. Convert continuous features into categorical features.
      3. Create a new feature as an indicator for missing values in another feature and supply a default value for the missing feature value.
    • Reduce the number of features to fix overfitting. You can keep only the most informative features, or compress them with dimensionality reduction techniques like Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA).
    • Change the features to fix underfitting. For example, to fit a decision boundary for non-linearly separable data, use non-linear features (e.g., for original feature x, add derived feature x^2).
  4. Improve optimization algorithm
    • Change the optimization algorithm used. Below are a few popular ones. To get the best results, you have to fine-tune the training parameters associated with each kind of optimization algorithm:
      1. Support Vector Machines
      2. Decision Trees (Parameters: number of leaves, number of trees, learning step, etc.)
      3. Neural Networks (Parameters: number of hidden nodes)
    • Run the optimization algorithm for more iterations to ensure that it converges.
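The feature engineering ideas in the list can be sketched in a few lines of plain Python. Everything here is illustrative: the field names (`x1`, `x2`, `age`, `income`), the bucket edges, and the helper names are assumptions, not part of any particular tool.

```python
from typing import List, Optional, Tuple


def product_feature(x1: float, x2: float) -> float:
    """Derived feature: the product of two existing features."""
    return x1 * x2


def to_category(value: float, edges: List[float], names: List[str]) -> str:
    """Convert a continuous value into a categorical one.

    `edges` must be ascending and `names` must have one more entry
    than `edges` (one name per bucket).
    """
    for edge, name in zip(edges, names):
        if value < edge:
            return name
    return names[-1]


def fill_missing(value: Optional[float], default: float) -> Tuple[float, int]:
    """Return (value or default, indicator); the indicator is 1 when missing."""
    if value is None:
        return default, 1
    return value, 0


# One hypothetical raw data point; `income` was not recorded.
raw = {"x1": 2.0, "x2": 3.0, "age": 30.0, "income": None}

income, income_missing = fill_missing(raw["income"], default=0.0)
features = {
    "x1": raw["x1"],
    "x2": raw["x2"],
    "x3": product_feature(raw["x1"], raw["x2"]),   # x3 = x1 * x2
    "x1_squared": raw["x1"] ** 2,                  # non-linear feature x^2
    "age_group": to_category(raw["age"], [18.0, 65.0],
                             ["minor", "adult", "senior"]),
    "income": income,
    "income_missing": income_missing,
}
print(features)
```

Each helper adds information the model could not easily discover from the raw columns, which is why this kind of work tends to depend on knowing the business domain.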

Azure Machine Learning is a great tool for learning about ML concepts and best practices.


Why use Azure Machine Learning?

Azure Machine Learning has a graphical user interface, which makes it easier to apply machine learning to your projects using minimal programming and math. Here are some useful functions that will make your ML experimentation more productive:

  1. Rich Optimization Algorithm Library. Experimenting with different algorithms can be tedious because you need to find out what is available and prepare your training data in specific formats. Azure ML has at least 20 different classification/regression optimization algorithms you can try without any programming.
  2. Productive Auto "Sweep Parameters" module. The problem with manually tuning parameters on your optimization algorithms is that it can take a long time before you find the best combination of parameter values. To solve this, Azure ML has a "Sweep Parameters" module that lets you perform an automatic parameter sweep on the model to find the optimum parameter settings.
  3. Integration with other popular ML libraries. You have the flexibility to use Python or R in your Azure ML models.
  4. Quick Deployment to Production Environment. I was able to wrap an API around my Azure ML model and deploy it in the real world with one week’s worth of man-hours.
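For readers curious what a parameter sweep does conceptually, here is a generic sketch in plain Python. It is not the Azure ML "Sweep Parameters" module or its API; it only shows the underlying idea of trying every combination in a grid and keeping the one that scores best on held-out data. The model (k-nearest-neighbours on one feature), the data, and the names `sweep` and `validation_accuracy` are all assumptions made for this example.

```python
from itertools import product
from typing import Callable, Dict, List, Sequence, Tuple


def knn_predict(train_x: List[float], train_y: List[int], x: float, k: int) -> int:
    """Majority vote among the k training points closest to x."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    votes_for_one = sum(train_y[i] for i in nearest)
    return 1 if 2 * votes_for_one > len(nearest) else 0


def sweep(grid: Dict[str, Sequence], score: Callable[..., float]) -> Tuple[dict, float]:
    """Try every combination of parameter values and keep the best-scoring one."""
    names = list(grid)
    best_params: dict = {}
    best_score = float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        current = score(**params)
        if current > best_score:  # on ties, the earlier combination wins
            best_params, best_score = params, current
    return best_params, best_score


# Invented data: training set with one noisy label, plus a validation set.
train_x = [1.0, 2.0, 2.5, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0]
train_y = [0, 0, 1, 0, 0, 1, 1, 1, 1]
valid_x = [2.4, 2.6, 4.4, 5.6]
valid_y = [0, 0, 0, 1]


def validation_accuracy(k: int) -> float:
    predictions = [knn_predict(train_x, train_y, x, k) for x in valid_x]
    return sum(p == y for p, y in zip(predictions, valid_y)) / len(valid_y)


best_params, best_score = sweep({"k": [1, 3, 5, 9]}, validation_accuracy)
print(best_params, best_score)
```

On this toy data the sweep settles on k = 3 with a validation accuracy of 1.0. Adding more entries to the grid dictionary sweeps more parameters at once, at the cost of many more training runs, which is exactly the tedium an automated sweep is meant to remove.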


Selected ML Questions and Answers from the Event

Q: What are some of the more advanced, best practices in machine learning to get high prediction accuracy?

A: I believe the best machine learning practitioners are those who know how to combine human intuition/wisdom with machine speed/pattern recognition capabilities. You need domain knowledge of your business scenario and industry in order to brainstorm features that give you high prediction accuracy. You can also add specific rules to process your input data before the training stage, or your output data after the testing stage. Pre-processing your data through a rules-based approach, instead of leaving the ML model to learn those patterns through generalization, can reduce your training time and improve accuracy!

Q: What kinds of optimization algorithm should I use? Seems like Deep Neural Networks are really popular now...

A: Having a simple model with a clear explanation is better than having a complicated model that cannot be understood intuitively. This consideration is especially important during debugging, when you want to find out why a test data point you put into your model is not giving you the desired output. Simple models are those that use linear decision boundaries, like linear SVM. My recommendation is to try linear models first to see if your data is inherently linearly separable so that you can avoid complex, non-linear models (e.g. Deep Neural Networks), which have longer training times.


Several Japanese companies interested in applying machine learning to their business models were in attendance. One of them was Fril, a resale marketplace where Japanese women who are dissatisfied with the high prices of high fashion can shop for designer products at discounted prices.

Featured Projects


VideoSelfie lets you record videos decorated with GIF stamps, filters, and music. Its GIF stamps use machine learning techniques for facial analysis to track a user’s face.

Another software project, built by a Japanese programmer, lets you use your hands and the hardware sensor, LeapMotion, to interact with virtual characters on a screen. The character he used in his demo was a virtual celebrity singer in Japan named Hatsune Miku.






We would like to extend a special thanks to Hiroki Watanabe, our professional translator, who is bilingual in English and Japanese. Without him, this event would not have been a success; he enabled the free exchange of ideas between our audience and speakers.


And, of course, a thank you to the DPX team for organizing the event!