27 November 2020 / product hunt

How Not to Implement an HTML Editor

If someone says, "Tell me about a technical failure you made", this is the story I'm going to share.

For the Product Hunt Collection project, I needed to implement a WYSIWYG HTML editor. The two choices at that time (March 2017) were Draft.js and Slate. I decided to go with Slate. It was easier to understand and had great examples that fit our needs.

Slate is a sound library and still serves us well to this day. The mistake I made was in the way I applied it to our project.

In the frontend, I wrapped and exposed Slate via:

HTMLInput - React component that encapsulates the Slate HTML editor. Used in forms.
slateToHTML - function which converts the Slate Editor state to a React component displaying its output HTML.

This encapsulation worked quite well.

<!-- HTMLInput in a form -->
<Form.Field name="description" control={HTMLInput} />

<!-- show HTML content -->
<p>{slateToHTML(collection.description)}</p>

Slate represents its state with a big nested JS object called the Editor. For the backend, I decided to store the Slate Editor state directly in a "JSONB" column.

My reasoning was the following:

It will make the code simpler.
- I only need to get the editor's state and pass it to the backend. Then, from the backend, put it back into Slate via slateToHTML.
It will be more secure because we don't deal with raw HTML, and I can validate the JSON structure in the backend via GraphQL input types.
We will have only one or two slateToHTML conversions per page, so there won't be a significant performance hit.

It took me a couple of days to build and ship a working HTML editor powered by Slate, but then...

What went wrong

The first bad sign was when we needed to send an email containing HTML generated by Slate. We needed to write a converter in Ruby that turned the Slate Editor state into an HTML string. Now, we had two places where we defined what our HTML Editor capabilities were.
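To make the idea of such a converter concrete, here is a minimal sketch in Ruby. The node shape (`"type"`, `"text"`, `"nodes"`), the tag mapping, and the `SlateSketch` name are assumptions for illustration only. They are not Slate's actual serialization format and not our production converter.

```ruby
# Hypothetical converter: walks a nested hash (a stand-in for the stored
# editor state) and emits an HTML string. The node shape is assumed.
module SlateSketch
  # Assumed mapping from node types to HTML tag names.
  TAGS = {
    "paragraph" => "p",
    "bold" => "strong",
    "italic" => "em",
    "list" => "ul",
    "list-item" => "li"
  }.freeze

  ESCAPES = {
    "&" => "&amp;", "<" => "&lt;", ">" => "&gt;", '"' => "&quot;", "'" => "&#39;"
  }.freeze

  # Escapes text so user content cannot inject markup.
  def self.escape(text)
    text.gsub(/[&<>"']/, ESCAPES)
  end

  def self.to_html(node)
    # Leaf nodes carry text only.
    return escape(node["text"]) if node.key?("text")

    children = Array(node["nodes"]).map { |child| to_html(child) }.join
    tag = TAGS[node["type"]]
    # Unknown types (including the document root) render only their children.
    tag ? "<#{tag}>#{children}</#{tag}>" : children
  end
end

state = {
  "type" => "document",
  "nodes" => [
    { "type" => "paragraph", "nodes" => [
      { "text" => "Hello " },
      { "type" => "bold", "nodes" => [{ "text" => "<world>" }] }
    ] }
  ]
}

puts SlateSketch.to_html(state)
```

Even in this toy form, the tag mapping duplicates knowledge that the frontend editor already has, which is exactly the "two places" problem.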

Then Slate changed the structure of its Editor state and records in our database couldn't be directly passed to Slate. We used our Ruby converter and extended it to be a translation layer between the old and the new Slate Editor formats. We were doubling our work.
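A translation layer of this kind can be sketched as a recursive upgrade over the stored JSON. The key rename used here (`kind` to `object`) and the `SlateUpgradeSketch` name are illustrative assumptions about what a format change might look like, not a record of the exact changes we had to handle.

```ruby
# Hypothetical translation layer: upgrades a stored node from an assumed
# "old" shape to an assumed "new" shape before any consumer sees it.
module SlateUpgradeSketch
  # Assumed key renames between format versions (illustrative only).
  RENAMED_KEYS = { "kind" => "object" }.freeze

  def self.upgrade(value)
    case value
    when Hash
      # Rename known keys and recurse into every value.
      value.each_with_object({}) do |(key, child), upgraded|
        upgraded[RENAMED_KEYS.fetch(key, key)] = upgrade(child)
      end
    when Array
      value.map { |child| upgrade(child) }
    else
      # Strings, numbers, booleans and nil pass through unchanged.
      value
    end
  end
end

old_state = {
  "kind" => "block",
  "type" => "paragraph",
  "nodes" => [{ "kind" => "text", "text" => "Hi" }]
}

p SlateUpgradeSketch.upgrade(old_state)
```

The upgrade is idempotent, so it can run on every read without tracking which records were already converted. That convenience is also the cost: every read pays for a format we no longer wanted.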

There were many smaller annoyances like searching in HTML fields or validating based on HTML length.

Different places also needed to allow or disallow certain HTML tags and custom elements. We only validated this in the frontend, because validating the Slate structure per field was tricky.

At the same time, more and more places used Slate: Comments, Ship Messages, Goals, and others. We started hitting performance issues.

These issues arose one by one over the span of a year. We were in boiling frog mode. Every fix was a workaround and made our system slower and more complex.

How we fixed this

We got stuck on an outdated Slate version - 0.19.30. It had a lot of bugs. The latest version at the time was 0.44.9.

During this time, we were working on Product Hunt Stories and needed to tune Slate even more. David was the person who took on the challenge of fixing our Slate issues.

The first step was to eliminate slateToHTML and just use sanitized HTML as a string. The GraphQL API used the Ruby converter to return strings instead of Slate Editor objects.

David made HTMLInput accept HTML as a string and convert it to Slate Editor state. With this, our frontend believed it dealt only with strings, not JS objects.

It was a challenge to handle custom elements. There were a lot of those like the YouTube player, Tweets, gallery, and others. Those are now just simple HTML elements with data attributes:

<div data-component="tweet" data-tweet-id="1321095097883205632" />

Slate can parse those and convert them to React components like SlateTweetEmbed or SlateYouTubePlayer.
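On the backend side, producing such an element from structured data could look like the sketch below. `EmbedSketch` and `embed_tag` are hypothetical names, and the helper emits an explicit closing tag because `div` is not a void element in HTML.

```ruby
# Hypothetical helper: renders a custom element as a plain <div> whose
# data attributes carry everything the frontend needs to rebuild it.
module EmbedSketch
  ESCAPES = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;", '"' => "&quot;" }.freeze

  def self.embed_tag(component, **attributes)
    data = { component: component }.merge(attributes)
    rendered = data.map do |name, value|
      # tweet_id becomes data-tweet-id.
      attr_name = "data-" + name.to_s.tr("_", "-")
      # Values are escaped so they stay inside the attribute.
      attr_value = value.to_s.gsub(/[&<>"]/, ESCAPES)
      "#{attr_name}=\"#{attr_value}\""
    end
    "<div #{rendered.join(' ')}></div>"
  end
end

puts EmbedSketch.embed_tag("tweet", tweet_id: "1321095097883205632")
```

Because the element is ordinary HTML, the same string works in an email or a search index, where the attributes are simply ignored.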

A special model concern was added to deal with HTML sanitization and validation:

class SomeModel < ApplicationRecord
  include SlateFieldOverride

  # "mode" defines which HTML tags are allowed
  slate_field :body, html_field: :body_html, mode: :everything
  
  # ...
end
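The idea behind "mode" can be illustrated with a small allow-list check in plain Ruby. This is only a sketch: the `:basic` mode, both tag lists, and the `HtmlFieldSketch` name are assumptions, and the regex scan stands in for the proper HTML parser a real sanitizer should use.

```ruby
# Hypothetical sketch of a per-field tag allow-list. The :basic mode and
# both tag lists are assumptions for illustration.
module HtmlFieldSketch
  ALLOWED_TAGS = {
    basic: %w[p strong em a],
    everything: %w[p strong em a ul ol li h2 blockquote img div]
  }.freeze

  # Returns the tag names used in `html` that the given mode does not allow.
  # A regex scan is enough for a sketch; it is not a real HTML parser.
  def self.disallowed_tags(html, mode:)
    allowed = ALLOWED_TAGS.fetch(mode)
    used = html.scan(%r{<\s*/?\s*([a-zA-Z][a-zA-Z0-9]*)}).flatten.map(&:downcase).uniq
    used - allowed
  end

  def self.valid?(html, mode:)
    disallowed_tags(html, mode: mode).empty?
  end
end

html = '<p>Hi</p><div data-component="tweet"></div>'
p HtmlFieldSketch.disallowed_tags(html, mode: :basic)
p HtmlFieldSketch.valid?(html, mode: :everything)
```

With HTML stored as a string, this kind of per-field rule becomes a simple backend check, which was hard to express against the nested Slate structure.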

The hardest thing was to migrate the database data. Fortunately, this went without any issues.

Our Ruby converter was battle-tested because it was already used to convert data for the frontend.

David migrated table by table, starting with the less relevant ones and eventually converting all of them.

The code for the migration looked something like:

Comment.not_migrated.find_each do |comment|
  comment.update!(
    body_html: Slate.to_html(comment.body),
  )
end

Storing HTML as a plain string solved all the issues we used to have:

rendering HTML in React is fast
the same HTML can be displayed in emails
tags can be allowed or disallowed per field
searching in HTML fields is easier
the data model is simpler

Lessons learned

Don't depend on structures coming from places you don't control (like external libraries) - build a translation layer for those.
Isolate external dependencies, so they only have one input and one output - if we didn't have the centralized HTMLInput and backend renderer, we would have had a lot more issues.
When adding a core component like a WYSIWYG Editor, which you expect to be widely used in your system - spend extra time to think through how its use is going to be expanded.
Not being able to update a dependency is a sign that you haven't isolated this dependency enough.

Conclusion

I think our whole team and I learned a lot from this blunder. I hope it won't take us so much time to realize and course correct when we make such mistakes in the future.