Are you a user, operator, developer, engineer, or simply someone with interesting user stories to tell about RabbitMQ? If so, we have some exciting news for you! The RabbitMQ Summit 2023 is just around the corner, and we are thrilled to invite you to submit your talks for this highly anticipated event.
The RabbitMQ Summit brings together a vibrant, diverse community of enthusiasts from all corners of the world. It’s a unique opportunity to connect, learn, and exchange ideas with fellow RabbitMQ users, experts, and developers.
This year, we have curated two exciting themes that encompass the breadth and depth of RabbitMQ usage and its impact across industries. We invite you to consider these themes when submitting your proposals:
1. Features Theme: in-depth technical explanations of RabbitMQ features, e.g. how to use RabbitMQ economically, explained by developers, or a roadmap for how to use streams.
2. Industry Theme: how RabbitMQ is used in industry, e.g. in the IoT space, troubleshooting, etc.
Think creatively and propose talks that bring unique perspectives, cutting-edge techniques, and actionable insights to the table. Don’t hesitate to submit your ideas; every story counts.
Event date and location
20 Oct 2023 at the Estrel Hotel, Berlin, Germany
Submission deadlines
All talks should be submitted by 23:59 BST on 11 June 2023. You will be informed about your submission status by 31 July 2023.
There is no event fee for speakers. For attendees, public Early Bird tickets will be available on 15 June 2023. You can sign up for the waiting list here.
For more information
You can visit the RabbitMQ Summit website for more details and follow our Twitter account for the latest updates.
Welcome to the XMPP Newsletter, great to have you here again! This issue covers the month of May 2023.
Many thanks to all our readers and all contributors!
Like this newsletter, many projects and their efforts in the XMPP community are a result of people’s voluntary work. If you are happy with the services and software you may be using, please consider saying thanks or helping these projects! Interested in supporting the Newsletter team? Read more at the bottom.
XSF Announcements
If you are interested in becoming an XSF member, you can apply during Q3 2023.
Berlin XMPP Meetup (remote): monthly meeting of XMPP enthusiasts in Berlin, every 2nd Wednesday of the month
FOSSY will have an XMPP track at their conference this summer from July 13-16th 2023
XMPP Italian happy hour: monthly Italian XMPP web meeting, starting May 16th and then every third Tuesday of the month at 7:00 PM (online event, with web meeting mode and live streaming).
Talks
The developer behind Libervia has scheduled a couple of informative talks in Paris for June. These presentations will delve into various aspects of the Libervia project and XMPP. The first talk is a 20-minute session in English, taking place at 15:00 on Thursday, June 15 at OW2. The second talk will be a more in-depth 60-minute discussion in French, scheduled for 17:30 on Friday, June 16 at Pas Sage En Seine. These sessions provide a great opportunity to gain insights into the Libervia project and to interact with its developer.
Gajim 1.8.0 has been released and it comes with integrated OMEMO encryption! Integrating the OMEMO plugin brings tighter integration and better user experience. The chat menu has been rearranged and some quick buttons have been added for convenience. Both Gajim’s message search and conversation view received some important changes and fixes.
Gajim’s chat banner in a group chat
Kaidan 0.9 and 0.9.1 have been released! It adds support for OMEMO 2, Automatic Trust Management (ATM), XMPP Providers, Message Reactions and much more.
Kaidan’s chat view
Exciting developments are on the horizon for Libervia. Thanks to a grant from NLnet through the NGI Assure Fund, work on the implementation of A/V calls with Jingle across several frontends is underway. This new feature aims to support both one-on-one and multi-party calls, with plans to even add remote desktop control capabilities. Additionally, the ActivityPub Gateway is currently being stabilised, which will further enhance the functionality of Libervia. For a full rundown of these updates and more, check out the latest progress note on Goffi’s blog.
JMP lets you send and receive text and picture messages (and calls) through a real phone number right from your computer, tablet, phone, or anything else that has a Jabber client. Jabber-side reactions are now translated where possible into the tapback pseudo-syntax. Read more in the JMP blog. Cheogram Android 2.12.1-6 features per-account colours and quiet hours, thumbhash previews for images, and many bug fixes.
omemo-dr, a new OMEMO crypto library is available. omemo-dr is a fork of python-axolotl, which is the crypto library used for OMEMO encryption in Gajim. In preparation for future changes (e.g. the next OMEMO version), Gajim developers forked this library.
The XMPP Standards Foundation develops extensions to XMPP in its XEP series in addition to XMPP RFCs.
Developers and other standards experts from around the world collaborate on these extensions, developing new specifications for emerging practices, and refining existing ways of doing things. Proposed by anybody, the particularly successful ones end up as Final or Active - depending on their type - while others are carefully archived as Deferred. This life cycle is described in XEP-0001, which contains the formal and canonical definitions for the types, states, and processes. Read more about the standards process. Communication around Standards and Extensions happens in the Standards Mailing List (online archive).
Proposed
The XEP development process starts by writing up an idea and submitting it to the XMPP Editor. Within two weeks, the Council decides whether to accept this proposal as an Experimental XEP.
This specification describes a generic method whereby content in messages can be tagged as having a specific Internet Content Type. It also provides a method for sending the same content using different content types, as a fall-back mechanism when communicating between clients having different content type support.
This document defines XMPP application categories for different use cases (Core, Web, IM, and Mobile), and specifies the required XEPs that client and server software needs to implement for compliance with the use cases.
Deferred
If an experimental XEP is not updated for more than twelve months, it will be moved off Experimental to Deferred. If there is another update, it will put the XEP back onto Experimental.
Clarify archive metadata response in the case of an empty archive.
Clarify query response in the case of no matching results. (mw)
Last Call
Last calls are issued once everyone seems satisfied with the current XEP status. After the Council decides whether the XEP seems ready, the XMPP Editor issues a Last Call for comments. The feedback gathered during the Last Call can help improve the XEP before returning it to the Council for advancement to Stable.
No Last Call this month.
Stable
No XEP moved to stable this month.
Deprecated
No XEP deprecated this month.
Call for Experience
A Call For Experience, like a Last Call, is an explicit call for comments, but in this case it’s mostly directed at people who’ve implemented, and ideally deployed, the specification. The Council then votes to move it to Final.
Looking for job offers or want to hire a professional consultant for your XMPP project? Visit our XMPP job board.
Newsletter Contributors & Translations
This is a community effort, and we would like to thank translators for their contributions. Volunteers are welcome! Translations of the XMPP Newsletter will be released here (with some delay):
This XMPP Newsletter is produced collaboratively by the XMPP community. Each month’s newsletter issue is drafted in this simple pad. At the end of each month, the pad’s content is merged into the XSF Github repository. We are always happy to welcome contributors. Do not hesitate to join the discussion in our Comm-Team group chat (MUC) and thereby help us sustain this as a community effort. You have a project and want to spread the news? Please consider sharing your news or events here, and promote it to a large audience.
Tasks we do on a regular basis:
gathering news in the XMPP universe
short summaries of news and events
summary of the monthly communication on extensions (XEPs)
I have been working as an Elixir developer for quite some time and recently came across the ChatGPT model. I want to share some of my experience interacting with it.
During my leisure hours, I am developing an open-source Elixir initiative, Crawly, that facilitates the extraction of structured data from the internet.
Here I want to demonstrate how ChatGPT helped me to improve the code. Let me show you some prompts I have made and some outputs I’ve got back.
Writing documentation for the function
I like documentation! It’s so nice when the code you’re working with is documented! What I don’t like is writing it. It’s hard, and it’s even harder to keep it updated.
So let’s consider the following example from our codebase:
My prompt:
Reasonably good for my taste. Here is how it looks after a few clarifications:
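As a stand-in illustration (a hypothetical module, not Crawly’s actual code), this is the kind of @doc/@spec annotation such a prompt typically yields for an Elixir function:

defmodule Example.Parser do
  @doc """
  Extracts all absolute http/https links from the given HTML document.

  Returns a list of URL strings; the list is empty when no links are found.
  """
  @spec extract_links(binary()) :: [String.t()]
  def extract_links(html) do
    ~r/href="(https?:\/\/[^"]+)"/
    |> Regex.scan(html, capture: :all_but_first)
    |> List.flatten()
  end
end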
Improving my functions
Let’s look at other examples of the code I have created! Here is an example of code I am not super proud of. Please don’t throw stones at me; I know I should not create atoms dynamically. Well, you know how it can be: you want to have something, but you don’t have enough time, so you plan to improve things later, and that later never happens. I think every human developer knows that story.
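To show what that smell looks like in practice, here is a hypothetical sketch (not Crawly’s actual code) of the dynamic-atom issue and a safer alternative:

# Atoms are never garbage collected, so converting arbitrary external strings
# with String.to_atom/1 can slowly exhaust the BEAM's atom table.
unsafe_key = fn string -> String.to_atom(string) end
unsafe_key.("some-user-input")  # permanently adds :"some-user-input" to the atom table

# A safer pattern only accepts atoms that already exist in the system:
safe_key = fn string -> String.to_existing_atom(string) end

safe_key.("timeout")       # => :timeout (an atom OTP already defines)
safe_key.("made_up_atom")  # => raises ArgumentError instead of leaking a new atom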
Suggesting tests for the code
Writing tests is a must-have in the modern world. TDD makes software development better. As a human, I like to have tests. However, I find it challenging to write them. Can we use the machine to do that? Let’s see!
Regarding tests, the response code needs to be more accurate. It might be possible to improve the results by feeding ChatGPT with Elixir test examples, but these tests are good as hints.
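For a sense of what a reasonable result looks like, here is a minimal ExUnit sketch written against the hypothetical Example.Parser module from the documentation example above (again an illustration, not Crawly’s real test suite):

defmodule Example.ParserTest do
  use ExUnit.Case, async: true

  describe "extract_links/1" do
    test "returns the absolute links found in the document" do
      html = ~s(<a href="https://example.com/a">a</a> <a href="https://example.com/b">b</a>)

      assert Example.Parser.extract_links(html) ==
               ["https://example.com/a", "https://example.com/b"]
    end

    test "returns an empty list when there are no links" do
      assert Example.Parser.extract_links("<p>no links here</p>") == []
    end
  end
end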
Creating Swagger documentation from JSON Schema
I like using JSON schema for validating HTTP requests. The question is: is it possible to convert a given JSON schema into a Swagger description, so we don’t have to do double work?
Let’s see. Fortunately, I have a bit of JSON schema-type data in Crawly. Let’s see if we can convert it into Swagger.
Well, it turns out it’s not a full Swagger format, which is a bit of a shame. Can we improve it a bit so it can be used immediately? Yes, let’s prompt-engineer it a bit:
Now I can preview the result in the real Swagger editor, and I think I like it:
Conclusion
What can I say? ChatGPT is a perfect assistant that makes development way less boring for regular people.
Using ChatGPT when programming with Elixir can bring several advantages. One of the most significant advantages is that it can provide quick and accurate responses to various programming queries, including syntax and documentation. This can help programmers save time and improve their productivity.
Additionally, ChatGPT can offer personalised and adaptive learning experiences based on individual programmers’ skill levels and preferences. This can help programmers learn Elixir more efficiently and effectively.
The use of AI and NLP in programming is likely to become more widespread in the near future. As AI technology continues to improve, it may become more integrated into the programming process, helping programmers to automate repetitive tasks and improve their overall productivity. ChatGPT is just one example of how AI is already being used to improve the programming experience, and more innovations in this field will likely emerge in the future.
Honestly, I am super excited and would love to explore the possibilities of AI even further!
Gajim 1.8.0 comes with integrated OMEMO encryption! Integrating the OMEMO plugin brings tighter integration and better user experience. We also rearranged the chat menu and added some quick buttons for convenience. Both Gajim’s message search and conversation view received some important changes and fixes. Thank you for all your contributions!
What’s New
In the past, we moved the most popular plugins into Gajim’s core: image preview, plugin installer, HTTP file upload, syntax highlight, and now OMEMO encryption as well. This means: troubles with installing/upgrading the plugin or finding a packaged version of it are no more 🎉
In preparation for future changes (e.g. the next OMEMO version), we forked python-axolotl, which is the crypto library used for OMEMO encryption in Gajim. Gajim’s fork of this library is called omemo-dr (double ratchet, an encryption mechanism) and you can track its progress here.
OMEMO Logo, by fiaxh - https://github.com/siacs/Conversations/blob/master/art/omemo_logo.svg, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=46134840
When you start Gajim 1.8.0 you will notice some changes in the chat window: the chat menu moved to the top, and some of its actions are now available via quick action buttons (contact details, sharing infos, and message search). In group chats, you can now search for participants and directly invite someone. Another change concerns group chats: you can now see from the chat list if Gajim is still joining or fetching messages (or if there is an error). When searching for messages, the search box is now displayed to the side of the chat, instead of overlapping it.
Gajim’s new quick actions and improved participants list
What else changed:
Fixed an annoying issue where Gajim would not load newer messages when jumping to the past
Audio preview for more file types and more image previews on Windows
Europe and the US are leading the way in the forecasted recession for 2023, due to persistently high inflation and increasing interest rates. With minimal projected GDP growth, modern technologies can play a crucial role in reducing the impact of economic downturns.
As caution looms, it can be tempting to rein in your investment. Your initial thought is to balance the books and save money where you can, which, on the surface, sounds like the sensible thing to do.
But consider this: investing in technology, specifically RabbitMQ, in the face of recession can actually reduce the cost of doing business and save you money over time. In fact, leading digital organisations that have prioritised this have a lower cost of doing business and a significant competitive edge in the current inflationary environment.
Carving out a tech budget can signify that you’re conscious of the current challenges while being hopeful for the future. Being mindful of the long game can allow businesses to thrive as well as survive.
So you really want to recession-proof your business? We’ll tell you why it’s best to invest.
Executives are already planning tech investments
In a survey compiled by Qualtrics, a majority of C-suite executives (85%) expect spending to increase this year as their companies go through both a workforce and a digital transformation, with tech priorities as follows:
Tech modernisation (73%)
Cybersecurity (73%)
Staffing/retaining workforce (72%)
Training/reskilling workforce (70%)
Remote work support (70%)
This is a major shift from previous cycles we’ve seen. In the past, tech investments have been one of the first on the chopping block. But now, businesses have taken note that investing in technology is not a cost, but a business driver and differentiator.
The game of business survival has changed
Prior to the pandemic in 2020, digital capabilities were not considered a high priority by businesses when considering preparations for a potential economic downturn. Fast forward to today, and we can see that COVID-19 has expedited digital change for businesses.
Companies have accelerated the digitisation of customer and supply-chain interactions by three to four years, and a sudden change in consumer habits has steered the ship.
Consumers made a drastic shift towards online channels, which meant that industries had to ramp up their digital offerings fast. Rates of adoption during this time were light years ahead of previous years. Fast forward to today, and 48% of consumers say their buying habits have permanently changed since the pandemic. Businesses have to maintain their digital presence in order to serve a growing market.
Adapting to demand
Technologies such as RabbitMQ have been integral to supporting the high volume of incoming requests and efficiently accommodating digital acceleration.
Consider this: your website might be selling tickets to an event, and you’re anticipating a record number of visitors. Your main consideration is keeping customer wait times acceptable. So what’s the solution? A system that can process requests faster.
RabbitMQ technology helps with customer wait times through its ability to handle high volumes of incoming requests. These requests are distributed to various parts of the system, ensuring that each one is processed promptly. By effectively handling the increased load, RabbitMQ helps businesses maintain shorter wait times, accommodating increased or decreased message volumes with ease, as business demands fluctuate during a recession.
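To make that concrete, here is a minimal Elixir sketch using the community amqp client; the queue name, payload, and dependency versions are assumptions for illustration, not something prescribed here (it assumes {:amqp, "~> 3.3"} and {:jason, "~> 1.4"} in your deps and a broker on localhost):

# The web frontend publishes each incoming ticket order and returns to the
# user immediately; the queue absorbs the burst.
{:ok, conn} = AMQP.Connection.open()
{:ok, chan} = AMQP.Channel.open(conn)
{:ok, _}    = AMQP.Queue.declare(chan, "ticket_orders", durable: true)

payload = Jason.encode!(%{event: "summer_fest", seats: 2})
:ok = AMQP.Basic.publish(chan, "", "ticket_orders", payload, persistent: true)

# A worker consumes at the rate it can sustain; prefetch_count: 1 means it
# only takes a new order after acknowledging the previous one.
:ok = AMQP.Basic.qos(chan, prefetch_count: 1)
{:ok, _consumer_tag} = AMQP.Basic.consume(chan, "ticket_orders")

receive do
  {:basic_deliver, body, %{delivery_tag: tag}} ->
    IO.puts("processing order: #{body}")
    AMQP.Basic.ack(chan, tag)
end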
Sustainability is steering the tech agenda
Do you think businesses are slowing down on ESG during a recession? Think again.
A recent study revealed nearly half of CFOs plan to increase investment in ESG initiatives this year despite high inflation, ongoing supply chain challenges and the risk of recession.
According to the World Economic Forum, 90% of executives believe sustainability is important. However, only 60% of organisations have sustainability strategies.
Addressing ESG initiatives during a recession requires a thoughtful approach from C-Suite leaders and should be apparent across your tech offerings. While not all software is inherently sustainable, tools like RabbitMQ can support sustainable practices.
For example, organisations can further enhance the sustainability of their messaging infrastructure by running RabbitMQ on energy-efficient hardware, leveraging cloud services with renewable energy commitments, or optimising their message routing algorithms to minimise resource usage. These additional considerations can contribute to a more sustainable use of RabbitMQ as part of a broader sustainability framework within a business.
Providing professional, reliable, and cutting-edge products and services with ESG values can bring meaningful change to people’s lives. And building social responsibility is critical to support the entire ecosystem.
Tech built for business operations
IT was once run on very centralised systems. But today, most software used by businesses is outsourced, meaning the services are disjointed and require training to understand how to manage each one individually, which can be costly.
SOA stands for Service-Oriented Architecture. In relation to RabbitMQ, SOA refers to a design approach where software applications are built as a collection of individual services that communicate with each other to perform specific tasks. RabbitMQ can play a role in facilitating communication between these services.
Imagine your company’s different departments: sales, inventory, shipping, HR, etc. Each department has its own software application that handles specific tasks. In a traditional architecture, these applications might be tightly coupled, meaning they directly interact with each other.
With a service-oriented architecture, the applications are designed as separate services. Each service focuses on a specific function, like processing sales orders, managing inventory, or tracking shipments. These services can communicate with each other by sending messages.
RabbitMQ acts as a messenger between these services. When one service needs to send information or request something from another service, it can publish a message to RabbitMQ. The receiving service listens and responds accordingly. This decoupled approach allows services to interact without having direct knowledge of each other, promoting flexibility and scalability.
For example, when a sales application receives a new order, it can publish a message to RabbitMQ containing the order details. The inventory service, subscribed to the relevant message queue, receives the order information, updates the inventory, and sends a confirmation back through RabbitMQ. The shipping service can then listen for shipping requests and initiate the shipping process.
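A hedged sketch of that sales-to-inventory flow, again with the Elixir amqp client; the exchange, queue, and message fields are illustrative assumptions rather than a prescribed design:

{:ok, conn} = AMQP.Connection.open()
{:ok, chan} = AMQP.Channel.open(conn)

:ok      = AMQP.Exchange.declare(chan, "orders", :direct, durable: true)
{:ok, _} = AMQP.Queue.declare(chan, "inventory", durable: true)
:ok      = AMQP.Queue.bind(chan, "inventory", "orders", routing_key: "order.created")

# The sales service publishes the new order; it neither knows nor cares
# which services are listening.
order = Jason.encode!(%{order_id: "A-1001", sku: "WIDGET-7", quantity: 3})
:ok = AMQP.Basic.publish(chan, "orders", "order.created", order, persistent: true)

# The inventory service, running in its own process (or on another machine),
# subscribes to the queue and reacts to each order independently.
{:ok, _tag} = AMQP.Basic.consume(chan, "inventory")

receive do
  {:basic_deliver, body, %{delivery_tag: tag}} ->
    IO.puts("updating stock for: #{body}")
    AMQP.Basic.ack(chan, tag)
end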
During a recession, SOA with RabbitMQ enables a more modular and flexible system, where services can be developed, deployed, and scaled independently. It simplifies communication between different components, promotes loose coupling, and allows for efficient integration and the ability to quickly adapt to changing market conditions of a recession.
Tech supports recession-proofing goals
Investment in innovative digital initiatives is indispensable for a constantly evolving digital transformation journey, especially during market shifts. Programmes such as RabbitMQ provide organisations with the flexibility required to swiftly shift to new solutions, responding to market changes more quickly.
To conclude, although recessions can be intimidating, it is crucial for businesses to embrace technology as a means to prepare themselves and ensure a positive customer experience. Leveraging technological solutions allows businesses to stay resilient, adapt to changing market dynamics, and position themselves for long-term success, even in challenging economic times.
If you’d like to talk about your current tech space, feel free to drop us a line.
We’ve had an important security issue reported that affects all recent versions of Openfire. We’ve fixed it in the newly published 4.6.8 and 4.7.5 releases. We recommend people upgrade as soon as possible. More info, including mitigations for those who cannot upgrade quickly, is available in this security advisory: CVE-2023-32315: Administration Console authentication bypass.
Related to this issue, we have also made available updates to three of our plugins:
If you’re using these plugins, it is recommended to update them immediately.
If you are using the REST API plugin, or any proprietary plugins, updating Openfire might affect the availability of their functionality. Please find workarounds in the security advisory.
If you have any questions, please stop by our community forum or our live groupchat. We are always looking for volunteers interested in helping out with Openfire development!
For other release announcements and news follow us on Twitter and Mastodon.
The Ignite Realtime Community is happy to announce the 4.6.8 release of Openfire!
We have made available a new release of this older version to address the issue that is the subject of security advisory CVE-2023-32315.
We are aware that for some, the process of deploying a new major version of Openfire is not a trivial matter, as it may encompass a lot more than only performing the update of the executables. Depending on regulations that are in place, this process can require a lot of effort and take a long time to complete. To facilitate users that currently use an older version of Openfire, we are also making available this new release in the older 4.6 branch of Openfire. An upgrade to this version will, for some, require a lot less effort. Note well: although we are making available a new version in the 4.6 branch of Openfire, we strongly recommend that you upgrade to the latest version of Openfire (currently in the 4.7 branch), as that includes important fixes and improvements that are not available in 4.6.
You can find download artifacts available here with the following sha256sum values
If you have any questions, please stop by our community forum or our live groupchat. We are always looking for volunteers interested in helping out with Openfire development!
For other release announcements and news follow us on Twitter and Mastodon.
Welcome to the second chapter of the “Elixir, 7 steps to start your journey” series.
In the first chapter, we talked about the Erlang Virtual Machine, the BEAM, and the characteristics that Elixir takes advantage of to develop systems that are:
Concurrent
Fault-tolerant
Scalable and
Distributed
In this note, I’ll explain what concurrency means to Elixir and Erlang and why it’s essential for building fault-tolerant systems. You’ll also find a little Elixir code example to see the advantages of concurrency in action.
Concurrency
Concurrency is the ability to perform two or more tasks apparently at the same time.
To understand why the word apparently is highlighted, let’s look at the following case:
A person has to complete two activities, task A and task B.
Starts task A, moves forward a bit, and pauses.
Starts task B, moves forward, pauses, and continues with task A.
Goes ahead with task A, pauses, and continues with task B.
And so it progresses with each one, until finishing both activities.
It is not that task A and task B are carried out at precisely the same time; instead, the person spends time on each one and switches between them. These times can be so short that the switching is imperceptible to us, producing the illusion that the activities are happening simultaneously.
Parallelism
So far, I haven’t mentioned anything about parallelism because it’s not a fundamental concept in the BEAM or for Elixir. But I remember that when I was learning to program, it took me a while to understand the difference between parallelism and concurrency, so I’ll take advantage of this note to explain it briefly.
Let’s continue with the previous example. If we now bring in another person to complete the tasks and they both work at the same time, we now achieve parallelism.
So, we could have two or more people working in parallel, each carrying out their activities concurrently. That is, the concurrency may or may not be parallel.
In Elixir, concurrency is achieved thanks to Erlang processes, which are created and managed by the BEAM.
Processes
In Elixir all code runs inside processes. And an application can have hundreds or thousands of them running concurrently.
How does it work?
When the BEAM runs on a machine, it creates a thread on each available processor by default. In this thread, there is a queue dedicated to specific tasks, and each queue has a scheduler responsible for assigning a time and a priority to the tasks.
So, on a multicore machine with two processors, you can have two threads and two schedulers, allowing you to parallelize tasks as much as possible. You can also adjust BEAM’s settings to indicate which processors to use.
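You can check how many scheduler threads the BEAM started on your own machine directly from iex; the value in the comment is just an example:

System.schedulers_online()
# => 8  (one scheduler per core by default)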
As for the tasks, each one is executed in an isolated process.
It seems simple, but precisely this idea is the magic behind the scalability, distribution, and fault tolerance of a system built with Elixir.
Let’s go deep into this last concept to understand why.
Fault-tolerance
The fault tolerance of a system refers to its ability to handle errors. The goal is that no failure, no matter how critical, disables or blocks the system and this is again achieved thanks to Erlang processes.
The processes are isolated elements that do not share memory and communicate through message passing.
This means that if something goes wrong in process A, process B is unaffected. It may not even know about it. The system will continue functioning normally while the fault is fixed behind the scenes. And if we add that the BEAM also provides default mechanisms for error detection and recovery, we can guarantee that the system works uninterruptedly.
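A tiny sketch you can paste into iex shows both ideas at once: a crashing process dies alone, and processes coordinate only through message passing.

parent = self()

# This process raises immediately; only it dies, iex just logs the error.
spawn(fn -> raise "boom" end)

# This one finishes its work and reports back with a message.
spawn(fn -> send(parent, {:done, 40 + 2}) end)

receive do
  {:done, result} -> IO.puts("worker finished with #{result}")
end
# => worker finished with 42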
Let’s review an example of creating processes that run concurrently in Elixir. We are going to contrast it with the same exercise running sequentially.
Ready? Don’t worry if you don’t understand some of the syntax; overall, the language is very intuitive, but the goal is to witness the magic of concurrency in action.
The first step is to create the processes.
Spawn
There are different ways to create processes in Elixir. As you progress, you will find more sophisticated ways to do it; here, we will use the most basic one: the spawn function. Let’s do it!
We have ten records with user information that we will insert into a database, but first we want to validate that the name does not contain strange characters and that the email has an @.
Suppose each user validation takes a total of 2 seconds.
1. In your favorite text editor, copy the following code and save it in a file called processes.ex.
defmodule Processes do
# We are going to use regular expressions for the name format and the
# email
@valid_email ~r/^([a-zA-Z0-9_\-\.\+]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$/
@valid_name ~r/\b([A-ZÀ-ÿ][-,a-z. ']+[ ]*)+/
# There is a list of users with a name and email.
# The function validate_users_X calls another function:
# validate_user, which checks the format of the email and prints an
# ok or error message for each record
# This function works SEQUENTIALLY
def validate_users_sequentially() do
IO.puts("Validating users sequentially...")
users = create_users()
Enum.each(users, fn elem ->
validate_user(elem) end)
end
# This function works CONCURRENTLY, with spawn
def validate_users_concurrently() do
IO.puts("Validating users concurrently...")
users = create_users()
Enum.each(users, fn elem ->
spawn(fn -> validate_user(elem) end)
end)
end
def validate_user(user) do
user
|> validate_email()
|> validate_name()
|> print_status()
# This pauses for 2 seconds to simulate the process inserting
# the records into the database
Process.sleep(2000)
end
# This function receives a user, validates the format of the email, and
# adds the valid_email key with the result.
def validate_email(user) do
if Regex.match?(@valid_email, user.email) do
Map.put(user, :valid_email, true)
else
Map.put(user, :valid_email, false)
end
end
# This function receives a user, validates the format of the name, and
# adds the valid_name key with the result.
def validate_name(user) do
if Regex.match?(@valid_name, user.name) do
Map.put(user, :valid_name, true)
else
Map.put(user, :valid_name, false)
end
end
# This function receives a user that has already gone through
# email and name validation and, depending on the result, prints
# the message corresponding to its status.
def print_status(%{
id: id,
name: name,
email: email,
valid_email: valid_email,
valid_name: valid_name
}) do
cond do
valid_email && valid_name ->
IO.puts("User #{id} | #{name} | #{email} ... is valid")
valid_email && !valid_name ->
IO.puts("User #{id} | #{name} | #{email} ... has an invalid name")
!valid_email && valid_name ->
IO.puts("User #{id} | #{name} | #{email} ... has an invalid email")
!valid_email && !valid_name ->
IO.puts("User #{id} | #{name} | #{email} ... is invalid")
end
end
defp create_users do
[
%{id: 1, name: "Melanie C.", email: "melaniec@test.com"},
%{id: 2, name: "Victoria Beckham", email: "victoriab@testcom"},
%{id: 3, name: "Geri Halliwell", email: "gerih@test.com"},
%{id: 4, name: "123456788", email: "melb@test.com"},
%{id: 5, name: "Emma Bunton", email: "emmab@test.com"},
%{id: 6, name: "Nick Carter", email: "nickc@test.com"},
%{id: 7, name: "Howie Dorough", email: "howie.dorough"},
%{id: 8, name: "", email: "ajmclean@test.com"},
%{id: 9, name: "341AN L1ttr377", email: "Brian-Littrell"},
%{id: 10, name: "Kevin Richardson", email: "kevinr@test.com"}
]
end
end
2. Open a terminal, type iex and compile the file we just created.
$ iex
Erlang/OTP 25 [erts-13.1.3] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1] [jit:ns]
Interactive Elixir (1.14.0) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> c("processes.ex")
[Processes]
3. Once you’ve done this, call the function that validates the records sequentially. Remember that it will take a little time since each record takes 2 seconds.
iex(2)> Processes.validate_users_sequentially
4. Now call the function that validates the records concurrently and observe the difference in times.
iex(3)> Processes.validate_users_concurrently
It’s pretty noticeable, don’t you think? This is because in step 3, with sequential evaluation, each validation has to wait for the previous one to finish. In contrast, concurrent execution creates processes that run in isolation; none of them depends on the previous one, nor is it blocked by any other task.
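If you want to put rough numbers on the difference, :timer.tc/1 measures a call in microseconds; the timings below are approximate and assume the module was compiled as above.

{micros, _} = :timer.tc(fn -> Processes.validate_users_sequentially() end)
micros / 1_000_000
# => about 20 seconds (10 users x 2 seconds, one after another)

{micros, _} = :timer.tc(fn -> Processes.validate_users_concurrently() end)
micros / 1_000_000
# => a small fraction of a second: spawn/1 returns immediately, and the ten
#    validations all finish in the background roughly 2 seconds later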
Imagine the difference with thousands or millions of tasks in a system!
Concurrency is the foundation for the other features we mentioned: distribution, scalability, and fault tolerance. Thanks to the BEAM, implementing it in Elixir and taking advantage of it becomes relatively easy.
Now, you know more about processes and concurrency, especially about the importance of this aspect in building highly reliable and fault-tolerant systems. Remember to practice and come back to this note when you need to.
Next Chapter
In the next note, we will talk about the libraries, frameworks, and all the resources that exist around Elixir. You will be surprised how easy and fast it is to create a project from scratch and see it working.
Below is a list of the new capabilities brought to our Messaging products for the 19.0 release. 19.0 adds a lot of extra functionality across the board for our messaging products, along with a complete rewrite of the codebase so that future releases and bug fixes can be developed more quickly. For the full release notes please check the individual product updates, available from the customer portal and evaluation sections of our website.
Dependencies
Cobalt (version 1.3 or later) is needed to manage various capabilities in M-Switch 19.0.
M-Switch, M-Store and M-Box depend on M-Vault 19.0. All of these products are a part of R19.0 with common libraries and so are commonly installed together.
Product Activation
All of the messaging products now use the new product activation. Product activation is managed with the Messaging Activation Server (MAS), which provides a Web interface to facilitate managing activation of messaging and other Isode products. MAS is provided as a tool, but installed as an independent component.
M-Switch
Product Activation
There are a number of M-Switch features arising from the new product activation:
Various product options are encoded in the activation, restricting functionality to M-Switch options purchased. The options available and any activation time limits are displayed by MConsole.
MConsole will correctly display the product name of the M-Switch being used (e.g., M-Switch MIXER, M-Switch Gateway etc).
MConsole views are restricted so that only those relevant to the activated options are shown (e.g., ACP 127 views will not be shown unless ACP 127 is activated).
Use of Cobalt
A number of functions have been moved from MConsole to Cobalt, which provides a general Web administration interface. MConsole is now more focused on M-Switch server configuration and operation. Capabilities provided by Cobalt in support of M-Switch:
User and Role provisioning (replacing Internet Mail View)
Special function mailboxes
Redirections
Standard SMTP distribution lists
Military Distribution Lists
Profiler Configuration
File Transfer by Email (FTBE) account provisioning
Directory and Authentication
A number of enhancements have been made to improve the security of authentication. New configurations will require this improved security, and upgrades are expected to switch to it.
Configuration of default M-Vault configuration directory is simplified.
Option provided to use a different M-Vault directory for users/operators, defaulting to the configuration directory.
M-Switch access to configuration and user directories will always authenticate using SASL SCRAM-SHA-1. This is particularly important for deployments not using TLS, as it will ensure plain passwords are not sent over a link, while still using hashed passwords in M-Vault.
M-Vault directories created by MConsole will always have TLS enabled (where the product activation option allows).
Connections from M-Switch to M-Vault will use TLS by default.
Three modes can be configured for SMTP and SOM (MConsole) access to M-Switch:
SCRAM-SHA-1. This is the default and is a secure option suitable for most configurations.
PLAIN. This option is needed if authentication is done using pass-through to Active Directory. This should only be used on systems with TLS.
ANY. When this option is used, SOM/MConsole will use SCRAM-SHA-1. It is needed for SMTP setups that want to offer additional SASL mechanisms such as CRAM-MD5, which will need plain passwords to be stored in M-Vault.
ACP 127
An extensive set of enhancements has been provided for ACP 127.
Extend circuit control from Enabled/Disabled to Enabled (Rx/Tx) / Rx Only / Disabled
Enhanced OPSIG support for BRIPES, following the agreed document:
QRT/QRV. Supports remote enable/disable, including control from top level of circuit management UI
ZES2 automatic handling on receive
Service message option to send INT ZBZ
Configurable option for reliable circuit to send ZBZ5 to acknowledge receipt of identified message
Limiting priority UI uses two-letter codes, but will still recognize single letters
Add CHANNEL CHECK generation and response
Option to use “Y” for emergency messages
Support for Community Variables (CV) which is a BRASS mechanism to use multiple crypto keys
Configuration of CVs available for each destination
Display of CVs for queued messages
CV Audit Logging
Scheduled Broadcasts to support MUs with constrained availability (e.g., Submarines)
Periodic Mode with GUI configuration
UI to show which messages will be transmitted in which period based on estimated transmission times
Scheduled periods at same time each day
Explicitly scheduled fixed intervals on specific day
Extension to Routing Tree configuration to specify a specific channel. This makes it easier to utilize ACP 127 RI routing, which is needed in many ACP 127 configurations
Improved mapping of CAD/AIG to SMTP
Option to turn off message reassembly
Improvements to monitoring of circuits using serial links
FAB (Frequency Assignment Broadcast)
A subsystem is provided to support FAB, which is needed for older BRASS systems that do not support ALE. The M-Switch FAB architecture is described in https://www.isode.com/whitepapers/brass.html. The key points are listed below:
A new FAB Server component is provided to run black side and generate the FAB data stream(s).
Red/Black separation can be provided by M-Guard
The FAB Server can monitor a remote modem for link quality using a new SNR monitoring protocol provided by Icon-5066 3.0.
Circuits to support FAB use a new “anonymous” type, reflecting that they are not associated with a specific peer.
Support is provided for ARQ (STANAG 5066 COSS) circuits which operate automatically shore side and for direct to modem circuits which require a shore side operator.
There is an operator UI for each circuit that enables setting FAB status and controlling acceptance of messages
Profiler and Corrector
Support of TLS for Corrector UI and Manual Profiler
Improved message display, including Security Label
Profile configuration read from directory, which enables Cobalt configuration of Profiler rules
Icon-Topo Support
Isode’s Icon-Topo product automatically updates M-Switch configuration in support of MU Mobility. M-Switch enhancements made in support of this:
Show clearly in MConsole when External MTAs, Routing Tree Entries and Nexus are created by Icon-Topo.
Enhance Nexus and Diversion UI to better display Icon-Topo created information.
Miscellaneous
Configure Warning Time based on Message Priority.
Tool to facilitate log and archive clear out
M-Store
No new features for R19.0.
M-Box
Improved Searching
Message searching is extended with three new capabilities that are exposed in Harrier.
Choice to search based on SIC (Subject Indicator Code) which can be used on its own or in conjunction with options to search other parts of the message.
Option to filter search based on a choice of one or more message precedences, matching against the action or info precedence as appropriate for the logged in user.
Option to filter search based on selected security label.
Below is a list of the new capabilities brought to our Directory products for the 19.0 release. 19.0 adds a lot of extra functionality across the board for our directory products, along with a complete rewrite of the codebase so that future releases and bug fixes can be developed more quickly. For the full release notes please check the individual product updates, available from the customer portal and evaluation sections of our website.
Dependencies
Use of several new 19.0 features depends on Cobalt 1.3 or later.
M-Vault
Product Activation
M-Vault uses the new product activation. Product activation is managed with the Messaging Activation Server (MAS), which provides a Web interface to facilitate managing activation of messaging and other Isode products. MAS is provided as a tool, but installed as an independent component.
Headless Setup
M-Vault, in conjunction with Cobalt, provides a mechanism to set up a server remotely with a Web interface only. This complements setup on the server using the M-Vault Console GUI.
Password Storage
Password storage format defaults to SCRAM-SHA-1 (hashed). This hash format is preferred as it enables use of SASL SCRAM-SHA-1 authentication which avoids sending plain passwords. Storage of passwords in the plain (previous default) is still allowed but discouraged.
LDAP/AD Passthrough
An LDAP Passthrough mechanism is added so that M-Vault users can be authenticated over LDAP against an entry in another directory. The key target for this mechanism is where there is a need to manage information in M-Vault, but to authenticate users with password against users provisioned in Microsoft Active Directory. This is particularly important for Isode applications such as M-Switch, M-Link, and Harrier which utilize directory information not generally held in Active Directory.
Cobalt provides capabilities to manage accounts utilizing LDAP Passthrough.
OAuth Enhancements
A number of enhancements have been made to OAuth, which was introduced in R18.1:
The OAUTH service has been integrated into the core M-Vault server, which simplifies configuration and improves security.
Operation without Client Secret, validating OAUTH Client using TLS Client Authentication. This improves security and resilience.
Allow client authentication using Windows SSO, so that Windows SSO can work for OAUTH Clients. This enables SSO to be used for Isode’s applications using OAuth.
Sodium Sync
Some enhancements to Sodium Sync to improve operation on Windows Server.
Option that will improve performance for any remote server with a large round-trip-time.
MongooseIM is a highly customisable instant messaging backend that can handle millions of messages per minute, exchanged between millions of users from thousands of dynamically configurable XMPP domains. With the new 6.1.0 release it becomes even more cost-efficient, flexible and robust thanks to the new arm64 Docker containers and the C2S process rework.
Arm64 Docker containers
Modern applications are often deployed in Docker containers. This solution simplifies deployment to cloud-based environments, such as Amazon Web Services (AWS) and Google Cloud. We believe this is a great choice for MongooseIM, and we also support Kubernetes by providing Helm Charts. Docker images are independent of the host operating system, but they need to be built for specific processor architectures. Amd64 (x86-64) CPUs have dominated the market for a long time, but recently arm64 (AArch64) has been taking over. Notable examples include the Apple Silicon and AWS Graviton processors. We made the decision to start publishing ARM-compatible Docker images with our latest 6.1.0 release.
To ensure top performance, we have been load-testing MongooseIM for many years using our own tools, such as amoc and amoc-arsenal-xmpp.
When we tested the latest Docker image on both amd64 and arm64 AWS EC2 instances, the results turned out to be much better than before – especially for arm64. The tested MongooseIM cluster consisted of two nodes, which is less than the recommended production size of three nodes. But the goal was to determine the maximum capability of a simple installation. Various compute-optimized instances were tested – including the 5th, 6th and 7th generations, all in the xlarge size. PostgreSQL (db.m6g.xlarge) was used for persistent storage, and three Amoc nodes (m6g.xlarge) were used for load generation. The three best-performing instance types were c6id (Intel Xeon Scalable, amd64), c6gd (AWS Graviton2, arm64) and c7g (AWS Graviton3, arm64).
The two most important test scenarios were:
One-to-one messaging, where each user chats with their contacts.
Multi-user chat, where each user sends messages to chat rooms with 5 participants each.
Several extensions were enabled to resemble a real-life use case. The most important are:
Message Archive Management (MAM) – message archive, allowing users to query for incoming and outgoing messages.
Instance cost per billion delivered messages (USD)      c6id      c6gd      c7g
One-to-one chat messages                                14.00     10.67     8.03
Multi-user chat messages                                 5.60      4.27     3.21
For each instance type, these costs are based on the highest message rates achievable without performance degradation. The load was scaled up for the c7g instances thanks to their better performance, making it possible to handle 600k one-to-one messages per minute in the whole cluster, which is 300k messages per minute per node. Should you need more, you can scale horizontally or vertically, and further tests showed an almost linear increase in performance – of course there are limits (especially for the cluster size), but they are high. Maximum message rates for MUC Light were different because each message was routed to five recipients, making it possible to send up to 300k messages per minute, but deliver 1.5 million.
The results allowed calculating the costs of MongooseIM instances per 1 billion delivered messages, which are presented in the table above. Of course it might be difficult to reach these numbers in production environments because of the necessary margin for handling bursts of traffic, but during heavy load you can get close to these numbers. The database cost was actually higher than the cost of MongooseIM instances themselves.
C2S Process Rework
We have completely reimplemented the handling of C2S (client-to-server) connections. Although the changes are mostly internal, you can benefit from them, even if you are not interested in the implementation details.
The first change is about accepting incoming connections – instead of custom listener processes, the Ranch 2.1 library is now used. This introduces some new options, e.g. max_connections and reuse_port.
Prior to version 6.1.0, each open C2S connection was handled by two Erlang processes – the receiver process was responsible for XML parsing, while the C2S process would handle the decoded XML elements. They are now integrated into one, which means that the footprint of each session is smaller, and there is less internal messaging.
C2S State Machine: Separation of Concerns
The core XMPP operations are defined in RFC 6120, and we have reimplemented them from scratch in the new mongoose_c2s module. The most important benefit of this change from the user perspective is the vastly improved separation of concerns, making feature development much easier. A simplified version of the C2S state machine diagram is presented below. Error handling is omitted for simplicity. The “wait for session” state is optional, and you can disable it with the backwards_compatible_session configuration option.
A similar diagram for version 6.0 would be much more complicated, because the former implementation had parts of multiple extensions scattered around its code:
It is important to note that mod_presence is the only new module in the list. The others existed before, but parts of their code lived in the C2S module. By disabling unnecessary extensions, you can gain performance. For example, by omitting [mod_presence] from your configuration file you can skip all the server-side presence handling. Our load tests have shown that this can significantly reduce the total time needed to establish a connection. Moreover, disabling extensions is now 100% reliable and guarantees that no unwanted code will be executed.
Easier extension development
If you are interested in developing your custom extensions, it is now easier than ever, because mongoose_c2s uses the new C2S-related hooks and handlers and several new features of the gen_statem behaviour. C2S Hooks can be divided into the following categories, depending on the events that trigger them:
Most of the hooks are triggered by XMPP traffic. The only exception is foreign_event, which can be triggered by modules on demand, making it possible to execute code in the context of a specific user's C2S process.
Modules add handlers to selected hooks. Such a handler performs module-specific actions and returns an accumulator, which can contain special options (see the sketch after this list), allowing the module to:
Store module-specific data using state_mod, or replace the whole C2S state data with c2s_data.
Transition to a new state with c2s_state.
Perform arbitrary gen_statem transition actions with actions.
Stop the state machine gracefully (stop) or forcefully (hard_stop).
Deliver XML elements to the user, either with (route, flush) or without (socket_send) triggering hooks.
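To make these options more concrete, here is a rough sketch of what a handler attached to one of these hooks could look like. The hook name, the registration mechanism (omitted here) and the plain map standing in for the accumulator are illustrative assumptions, not MongooseIM's actual API:
-module(mod_example).
-export([user_send_message/3]).
%% Sketch only: in MongooseIM the accumulator is an opaque structure;
%% here a plain map stands in for it.
user_send_message(Acc, _Params = #{c2s_data := _C2SData}, _Extra) ->
    %% Store module-specific data under state_mod and let the state machine continue.
    Acc1 = Acc#{state_mod => #{?MODULE => #{messages_seen => 1}}},
    %% Returning e.g. Acc1#{stop => {shutdown, overloaded}} would instead
    %% ask the state machine to stop gracefully.
    {ok, Acc1}.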
Example
Let’s take a look at the handlers of the new mod_presence module. For user_send_presence and user_receive_presence hooks, it updates the module-specific state (state_mod) storing the presence state. The handler for foreign_event is more complicated, because it handles the following events:
The example shows how the coupling between extension modules remains loose and modules don’t call each other’s code directly.
The benefits of gen_statem
The following new gen_statem features are used in mongoose_c2s (a self-contained sketch follows the list):
Arbitrary term state – with the handle_event_function callback mode it is possible to use tuples for state names. An example is {wait_for_sasl_response, cyrsasl:sasl_state(), retries()}, which has the state of the SASL authentication process and the number of authentication retries left encoded in the state tuple. Apart from the states shown in the diagram above, modules can introduce their own external states – they have the format {external, StateName}. An example is mod_stream_management, which causes transition to the {external, resume} state when a session is closed.
Multiple callback modules – to handle an external state, the callback module has to be changed, e.g. mod_stream_management uses the {push_callback_module, ?MODULE} transition action to provide its own handle_event function for the {external, resume} state.
State timeouts – for all states before wait_for_session, the session terminates after the configurable c2s_state_timeout. The timeout tuple itself is {state_timeout, Timeout, state_timeout_termination}.
Named timeouts – modules use these to trigger specific actions, e.g. mod_ping uses several timeouts to schedule ping requests and to wait for responses. The timeout tuple has the format {{timeout, ping | ping_timeout | send_ping}, Interval, fun ping_c2s_handler/2}. This feature is also used for traffic shaping to pause the state machine if the traffic volume exceeds the limit.
Self-generated events – this feature is used very often, for example when incoming XML data is parsed, an event {next_event, internal, XmlElement} is generated for each parsed XML element. The route and flush options of the c2s accumulator generate internal events as well.
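The snippet below is a self-contained sketch (not MongooseIM's actual code) demonstrating the features listed above: a tuple as the state name, a state timeout, a named timeout and a self-generated internal event.
-module(c2s_sketch).
-behaviour(gen_statem).
-export([start_link/0]).
-export([init/1, callback_mode/0, handle_event/4]).

start_link() ->
    gen_statem:start_link(?MODULE, [], []).

callback_mode() ->
    handle_event_function.

init([]) ->
    %% Arbitrary term state: a tuple carrying the number of retries left.
    InitialState = {wait_for_stream, _RetriesLeft = 3},
    {ok, InitialState, #{},
     [%% Terminate if we stay in an early state for too long.
      {state_timeout, 5000, state_timeout_termination},
      %% Self-generated event: pretend an XML element was just parsed.
      {next_event, internal, {xml_element, stream_start}}]}.

handle_event(internal, {xml_element, _El}, {wait_for_stream, _Retries}, Data) ->
    %% Move on and arm a named timeout, e.g. for periodic pings.
    {next_state, session_established, Data,
     [{{timeout, ping}, 30000, send_ping}]};
handle_event({timeout, ping}, send_ping, session_established, Data) ->
    %% The named timeout fired; re-arm it for the next ping.
    {keep_state, Data, [{{timeout, ping}, 30000, send_ping}]};
handle_event(state_timeout, state_timeout_termination, _State, _Data) ->
    {stop, normal};
handle_event(_Type, _Event, _State, _Data) ->
    keep_state_and_data.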
Summary
MongooseIM 6.1.0 is full of improvements on many levels – both on the outside, like the arm64 Docker images, and deep inside, like the separation of concerns in mongoose_c2s. What is common to all of them is that we have load-tested them extensively, making sure that our new messaging server delivers what it promises and that the performance is better than ever. There are no unpleasant surprises hidden underneath. After all, it is open source, and you are welcome to download, deploy, use and extend it free of charge. However, should you have a special use case, high performance requirements or a need to reduce costs, don't hesitate to contact us, and we will help you deploy, load test and maintain your messaging solution.
Even if Kaidan is making good progress, please keep in mind that it is not yet a stable app.
Do not expect it to work well on all supported systems.
Moreover, we currently do not consider Kaidan's security to be as good as that of the dominant chat apps.
All messages sent by Kaidan can be encrypted now.
If a contact supports the same encryption, Kaidan enables it by default.
Therefore, you do not have to enable it by yourself.
And you will also never need to worry about enabling it for new contacts.
But it is possible to disable it for each contact at any time.
Additionally, all metadata that is encryptable, such as typing notifications, is encrypted too.
The new Automatic Trust Management (ATM) makes trust management easier than before.
The details are explained in a previous post.
We worked hard on covering as many corner cases as possible.
Encrypted sessions are initialized in the background to reduce the loading time.
Kaidan even tries to repair sessions broken by other chat apps.
But if you discover any strange behavior, please let us know!
We decided to focus on future technologies.
Thus, Kaidan does not support OMEMO versions older than 0.8.1.
Unfortunately, many other clients do not support the latest version yet.
They only encrypt the body (text content) of a message, which is not compatible with newer OMEMO versions and ATM.
But we hope that other client developers will follow our lead soon.
XMPP Providers
Kaidan introduced an easy registration in version 0.5.
Since then, it has used its own list of XMPP providers.
The new project XMPP Providers arose from that approach.
That project is intended to be used by various applications and services.
Kaidan is now one of them.
It uses XMPP Providers for its registration process instead of maintaining its own list of providers.
Try it out and see how easy it can be to get an XMPP account with Kaidan!
Changelog
This release adds the following features:
End-to-end encryption with OMEMO 2 for messages, files and metadata including an easy trust management
XMPP Providers support for an easy onboarding
Message reactions for sending emojis upon a message
Read markers showing which messages a contact has read
Message drafts to send entered messages later after switching chats or restarting Kaidan
Message search for messages that are not yet loaded
New look of the chat background and message bubbles including grouped messages from the same author
Chat pinning for reordering chats
Public group chat search (without group chat support yet)
New contact and account details including the ability to change your own profile picture
Welcome to the XMPP Newsletter, great to have you here again! This issue covers the month of April 2023.
Many thanks to all our readers and all contributors!
Like this newsletter, many projects and their efforts in the XMPP community are a result of people’s voluntary work. If you are happy with the services and software you may be using, please consider saying thanks or help these projects! Interested in supporting the Newsletter team? Read more at the bottom.
XSF Announcements
xmpp.org got a new software section! Looking for XMPP software, i.e. clients, servers, libraries, components, and tools? Check out xmpp.org’s new software section, which lets you filter software by your own criteria. Looking for a client which works on Android and supports audio/video calls? Looking for a library that supports XEP-0461: Message Replies? Just apply the filter and see what you get!
Berlin XMPP Meetup (remote): monthly meeting of XMPP enthusiasts in Berlin, every 2nd Wednesday of the month
FOSSY will have an XMPP track at their conference this summer. Please submit talk proposals by May 14th. The track organizers offer financial support for presenters, if needed.
XMPP Italian happy hour: monthly Italian XMPP web meeting, starting May 16th and then every third Tuesday of the month at 7:00 PM. Online event, with web meeting mode and live streaming.
XMPP Sprints
Elbe-Sprint Hamburg 2023: Thursday, 22-06-2023 18:00 CEST — Sunday, 25-06-2023 12:00 CEST.
This summer, XMPP developers are holding a development sprint in Hamburg, Germany.
XMPP Videos
Axel Reimer published German video tutorials in his blog eversten.net.
One video [DE] explains some main aspects of XMPP.
A series of four videos [DE] explains how iOS users can start using XMPP by installing and configuring the messenger app Monal.
Axel Reimer introduced a new German website called xmpp24.de [DE]. This website focuses on helping new XMPP users who want to start using XMPP on their Android or iOS devices. It explains one onboarding flow (as a video tutorial) per operating system.
The JMP April Newsletter talks about several developments, including a new MMS stack in testing, integration with the Quicksy directory, the ability to create Snikket instances from inside a Cheogram Android onboarding flow, and an experimental WebXDC prototype.
Cheogram onboarding view
Software news
Clients and applications
Gajim 1.7.3 has been released. This release enables you to mute notifications for specific contacts and brings some improvements and bug fixes.
Servers
ejabberd 23.04 has been released. This is a big new release with many changes, including support for XEP-0425 (Message Moderation), a Real-Time Block List for MUC rooms, and several SQL improvements.
Web Console Chat has been released. It is an installation guide and a collection of patches to make existing XMPP console clients safe enough to serve to the web with ttyd for your web-chat service.
Extensions and specifications
The XMPP Standards Foundation develops extensions to XMPP in its XEP series in addition to XMPP RFCs.
Developers and other standards experts from around the world collaborate on these extensions, developing new specifications for emerging practices, and refining existing ways of doing things. Proposed by anybody, the particularly successful ones end up as Final or Active - depending on their type - while others are carefully archived as Deferred. This life cycle is described in XEP-0001, which contains the formal and canonical definitions for the types, states, and processes. Read more about the standards process. Communication around Standards and Extensions happens in the Standards Mailing List (online archive).
Proposed
The XEP development process starts by writing up an idea and submitting it to the XMPP Editor. Within two weeks, the Council decides whether to accept this proposal as an Experimental XEP.
No XEPs proposed this month.
New
No new XEPs this month.
Deferred
If an Experimental XEP is not updated for more than twelve months, it will be moved from Experimental to Deferred. If there is another update, the XEP will be moved back to Experimental.
Various changes, made in parallel with working client and server implementation experience, and SASL2 updates.
More tightly define the integration with XEP-0388 and several session feature XEPs: XEP-0198, XEP-0280, XEP-0352.
Replace the custom latest-id element with the new metadata element from XEP-0313, which also provides richer information.
Drop unread tracking, as this is a deep topic not directly related to resource binding. Instead the details of integration with other extensions have been better defined and demonstrated, to allow such functionality when it is fully defined and exists.
Adjust proposed namespace on aesthetic grounds and consistency with SASL2’s approach. As this protocol may become part of the new preferred connection flow for a long time to come, it makes no sense to include the redundant and potentially confusing ‘2’ when there is no conflict without it. Similarly, the ‘.0’ has been dropped from the XEP’s title, as it isn’t really a version number.
Allow the client some influence over the resulting resource identifier, and define a standard format for these combined identifiers.
Specify that servers should terminate old sessions from a client when it binds a new resource. (mw)
Remove raw-IQ mode and specify the reuse of PEP. (spw)
Last Call
Last Calls are issued once everyone seems satisfied with the current XEP status. After the Council decides whether the XEP seems ready, the XMPP Editor issues a Last Call for comments. The feedback gathered during the Last Call helps improve the XEP before returning it to the Council for advancement to Stable.
No Last Call this month.
Stable
No XEP moved to stable this month.
Deprecated
No XEP deprecated this month.
Call for Experience
A Call for Experience, like a Last Call, is an explicit call for comments, but in this case it is mostly directed at people who have implemented, and ideally deployed, the specification. The Council then votes to move it to Final.
Looking for job offers or want to hire a professional consultant for your XMPP project? Visit our XMPP job board.
Newsletter Contributors & Translations
This is a community effort, and we would like to thank translators for their contributions. Volunteers are welcome! Translations of the XMPP Newsletter will be released here (with some delay):
This XMPP Newsletter is produced collaboratively by the XMPP community. Each month’s newsletter issue is drafted in this simple pad. At the end of each month, the pad’s content is merged into the XSF Github repository. We are always happy to welcome contributors. Do not hesitate to join the discussion in our Comm-Team group chat (MUC) and thereby help us sustain this as a community effort. You have a project and want to spread the news? Please consider sharing your news or events here, and promote it to a large audience.
Tasks we do on a regular basis:
gathering news in the XMPP universe
short summaries of news and events
summary of the monthly communication on extensions (XEPs)
Welcome to the latest edition of your pseudo-monthly JMP update!
In case it’s been a while since you checked out JMP, here’s a refresher: JMP lets you send and receive text and picture messages (and calls) through a real phone number right from your computer, tablet, phone, or anything else that has a Jabber client. Among other things, JMP has these features: Your phone number on every device; Multiple phone numbers, one app; Free as in Freedom; Share one number with multiple people.
It has been a while since we got a newsletter out, and lots has been happening as we race towards our launch.
For those who have experienced the issue with Google Voice participants not showing up properly in our MMS group texting stack, we have a new stack in testing right now. Let support know if you want to try it out; it has been working well so far for those already using it.
If you check your account settings for the “refer a friend” option you will now see two kinds of referral code. The list of one-time use codes remains the same as always: a free month for your friend, and a free month’s worth of credit for you if they start paying. The new code up in the top is multi-use and you can post and share it as much as you like. It provides credit equivalent to an additional month to anyone who uses it on sign up after their initial $15 deposit as normal, and then a free month’s worth of credit for you after that payment fully clears.
We mentioned before that much of the team will be present at FOSSY, and we can now reveal why: there will be a conference track dedicated to XMPP, which we are helping to facilitate! Call for proposals ends May 14th. Sign up and come out this summer!
For quite some time now, customers have been asked while registering if they would like to enable others who know their phone number to discover their Jabber ID, to enable upgrading to end-to-end encryption, video calls, etc. The first version of this feature is now live, and users of at least Cheogram Android and Movim can check the contact details of anyone they exchange SMS with to see if a Jabber ID is listed. We are happy to announce that we have also partnered with Quicksy to allow discovery of anyone registered for their app or directory as well.
Jabber-side reactions are now translated where possible into the tapback pseudo-syntax recognized by many Android and iMessage users so that your reactions will appear in a native way to those users. In Cheogram Android you can swipe to reply to a message and enter a single emoji as the reply to send a reaction/tapback.
There have been two Cheogram Android releases since our last newsletter, with a third coming out today. You no longer need to add a contact to send a message or initiate a call. The app has seen the addition of moderation features for channel administrators, as well as respecting these moderation actions on display. For offensive media arriving from other sources, in avatars, or just not moderated quickly enough, users also have the ability to permanently block any media they see from their device.
Cheogram Android has seen some new sticker-related features, including default sticker packs and the ability to import any sticker pack made for Signal (browse signalstickers.com to find more sticker packs; just tap "Add to Signal" to add them to Cheogram Android).
There are also brand-new features today in 2.12.1-5, including a new onboarding flow that allows new users to register and pay for JMP before getting a Jabber ID, and then set up their very own Snikket instance all from within the app. This flow also features some new introductory material about the Jabber network which we will continue to refine over time:
Notifications about new messages now use the conversation style in Android. This means that you can set separate priority and sounds per conversation at the OS level on new enough versions of Android. There is also an option in each conversation's menu to add that conversation to your homescreen, something that has always been possible with the app, but hopefully this makes it more discoverable for some.
For communities organizing in Jabber channels, sometimes it can be useful to notify everyone present about a message. Cheogram Android now respects the attention element from members and higher in any channel or group chat. To send a message with this priority attached, start the message body with @here (this will not be included in the actual message people see).
This release also brings an experimental prototype supporting WebXDC. This is an experimental specification to allow developers to ship mini-apps that work inside your chats. Take any *.xdc file and send it to a contact or group chat where everyone uses Cheogram Android and you can play games, share notes, shopping lists, calendars, and more. Please come by the channel to discuss the future of this technology on the Jabber network with us.
To learn what’s happening with JMP between newsletters, here are some ways you can find out:
This major release adds significant new functionality and improvements to Red/Black, a management tool that allows you to monitor and control devices and servers across a network, with a particular focus on HF Radio Systems. A general summary is given in the white paper Red/Black Overview.
Switch Device
Support has been added for Switch type devices, which can connect multiple devices and allow an operator (red or black side) to change switch connections. Physical switch connectivity is configured by an administrator. The switch column can be hidden, so that logical connectivity through the switch is shown.
SNMP Support
A device driver for SNMP devices is provided, including SNMPv3 authorization. Abstract device specifications are included in Red/Black for:
SNMP System MIB
SNMP Host MIB
SNMP UPS MIB
Leonardo HF 2000 radio
IES Antenna Switch
eLogic Radio Gateway
Abstract device specifications can be configured for other devices with suitable SNMP MIBs.
Further details are provided in the Isode white paper "Managing SNMP Devices in Red/Black".
Alert Handling
The UI shows all devices that have alerts which have not been handled by an operator. It enables an operator to see all unhandled alerts for a device and to mark some or all of them as handled.
Device Parameter Display and Management
A number of improvements have been made to the way device parameters are handled:
Improved general parameter display
Display in multiple columns, with selectable number of columns and choice of style, to better support devices with large numbers of parameters
Parameter grouping
Labelled integer support, so that semantics can be added to values
Configurable Colours
Display of parameter Units
Configurable parameter icons
Optimized UI for Device refresh; enable/disable; power off; and reset
Integer parameters can specify “interval”
Parameters with limited integer values can be selected as drop down
Top Screen Display
The top screen display is improved.
Modes of “Device” (monitoring) and “Connectivity” with UIs optimized for these functions
Reduced clutter when no device is being examined
Allow columns to be hidden/restored so that the display can be tuned to operator needs
Show selected device parameters on top screen so that operator can see critical device parameters without needing to inspect the device details
UI clearly shows which links user can modify, according to operator or administrator rights
This ejabberd 22.10 release includes six months of work, over 140 commits, including relevant improvements in MIX, MUC, SQL, and installers, and bug fixes as usual.
This version brings support for latest MIX protocol version, and significantly improves detection and recovery of SQL connection issues.
There are no breaking changes in SQL schemas, configuration, or commands API. If you develop an ejabberd module, notice two hooks have changed: muc_subscribed and muc_unsubscribed.
A more detailed explanation of those topics and other features:
Erlang/OTP 19.3
You may remember that in the previous ejabberd release, ejabberd 22.05, support for Erlang/OTP 25 was introduced, even if 24.3 is still recommended for stable deployments.
It is expected that around April 2023, GitHub Actions will remove Ubuntu 18 and it will not be possible to run automatic tests for ejabberd using Erlang 19.3, the lowest possible will be Erlang 20.0.
For that reason, the planned schedule is:
ejabberd 22.10
Usage of Erlang 19.3 is discouraged
Anybody still using Erlang 19.3 is encouraged to upgrade to 24.3, or at least 20.0
ejabberd 23.05 (or later)
Support for Erlang 19.3 is deprecated
Erlang requirement softly increased in configure.ac
Announce that there is no guarantee that ejabberd can compile, start or pass the Common Tests suite using Erlang 19.3
Provide instructions for anybody to manually re-enable it and run the tests
ejabberd 23.xx+1 (or later)
Support for Erlang 19.3 is removed completely in the source code
New log_burst_limit_* options
Two options were added in #3865 to configure logging limits in case of high traffic (an example is shown after the list):
log_burst_limit_window_time defines the time period to rate-limit log messages by.
log_burst_limit_count defines the number of messages to accept in that time period before starting to drop them.
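For illustration, these could be set in ejabberd.yml roughly as below; the values are examples only, and the exact value format should be checked in the ejabberd documentation:
## Illustrative values: accept at most 500 log messages per window, then drop the rest
log_burst_limit_window_time: 1
log_burst_limit_count: 500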
Support ERL_DIST_PORT option to work without epmd
The option ERL_DIST_PORT is added to ejabberdctl.cfg, disabled by default.
When this option is set to a port number, the Erlang node will not start epmd and will not listen on a range of ports for Erlang connections (typically used for ejabberdctl and for clustering). Instead, the Erlang node will simply listen on the configured port number (see the example after the notes below).
Please note:
Erlang/OTP 23.1 or higher is required to use ERL_DIST_PORT
make relive doesn’t support ERL_DIST_PORT, neither rebar3 nor elixir
To start several ejabberd nodes in the same machine, configure a different port in each node
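As an example, enabling it in ejabberdctl.cfg can be as simple as the line below; the port number here is arbitrary:
# Make the Erlang node listen for distribution on a single fixed port
ERL_DIST_PORT=5210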
Support version macros in captcha_cmd option
Support for the @VERSION@ and @SEMVER@ macros was added to the captcha_cmd option in #3835.
Those macros are useful because the example captcha scripts are copied in a path like ejabberd-VERSION/priv/bin that depends on the ejabberd version number and changes for each release. Also, depending on the install method (rebar3 or Elixir’s mix), that VERSION may be in XX.YY or in SEMVER format (respectively).
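For example, with a macro the option stays valid across upgrades. The exact path below is an assumption and depends on where and how ejabberd is installed:
captcha_cmd: "/opt/ejabberd/lib/ejabberd-@VERSION@/priv/bin/captcha.sh"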
Two hooks have changed: muc_subscribed and muc_unsubscribed. Now they get the packet and the room state, and can modify the sent packets. If you write source code that adds functions to those hooks, please note that previously they were run like:
where Packet1b is a copy of Packet1a without the jid attribute in the muc_subscribe element.
Translations Updates
Several translations were improved: Ukrainian, Chinese (Simplified), French, German, Russian, Portuguese (Brazil), Spanish and Catalan. Thanks to all the people who contribute to ejabberd translations at Weblate!
WebAdmin page for external modules
A new page is added in ejabberd’s WebAdmin to view available external modules, update their source code, install, upgrade and remove them. All this is equivalent to what was already available using API commands from the modules tag.
Many modules in the ejabberd-contrib git repository have been improved, and their documentation updated. Additionally, those modules are now automatically tested, at least compilation, installation and static code analysis.
Documentation Improvements
In addition to the normal improvements and fixes, two sections in the ejabberd Documentation have been greatly improved.
It has been almost four years since my first article about scraping with Elixir and Crawly was published. Since then, many changes have occurred, the most significant being Erlang Solution’s blog design update. As a result, the 2019 tutorial is no longer functional.
This situation provided an excellent opportunity to update the original work and re-implement the Crawler using the new version of Crawly. By doing so, the tutorial will showcase several new features added to Crawly over the years and, more importantly, provide a functional version to the community. Hopefully, this updated tutorial will be beneficial to all.
First of all, why is it broken now?
This situation is to be expected! When a website gets a new design, usually everything is redone: the new layout results in new HTML, which makes all the old CSS/XPath selectors obsolete, to say nothing of new URL schemes. As a result, the XPath/CSS selectors that worked before refer to nothing after the redesign, so we have to start from the very beginning. What a shame!
But of course, the web exists for more than just crawling. The web is made for people, not robots, so let's adapt our robots!
Our experience from a large-scale scraping platform is that a successful business usually runs at least one complete redesign every two years. Smaller updates occur even more often, and remember that even minor updates can break your web scrapers.
Getting started
Usually, I recommend starting by following the Quickstart guide from Crawly’s documentation pages. However, this time I have something else in mind. I want to show you the Crawly standalone version.
The idea is to keep it simple. In some cases, you need data that can be extracted from a relatively simple source. In these situations, it can be quite beneficial to avoid bootstrapping all the Elixir machinery (a new project, config, libs, dependencies). The goal is to deliver data that other applications can consume without any setup.
Of course, the approach will have some limitations and only work for simple projects at this stage. Some may get inspired by this article and improve it so that the following readers will be amazed by new possibilities. In any case, let’s get straight to it now!
Bootstrapping 2.0
As promised, here is the simplified version of the setup (compare it with the previous setup described here):
1. Create a directory for your project: mkdir erlang_solutions_blog
2. Create a subdirectory that will contain the code of your spiders: mkdir erlang_solutions_blog/spiders
3. Now, knowing that we want to extract the following fields (title, author, publishing_date, url, article_body), define the following configuration for your project (erlang_solutions_blog/crawly.config):
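A sketch of such a configuration is shown below. The pipeline modules are standard Crawly pipelines, but the exact options and values here are illustrative rather than taken from the original article:
[{crawly, [
    {closespider_itemcount, 500},
    {concurrent_requests_per_domain, 2},
    {pipelines, [
        {'Elixir.Crawly.Pipelines.Validate',
            [{fields, [title, author, publishing_date, url, article_body]}]},
        {'Elixir.Crawly.Pipelines.DuplicatesFilter', [{item_id, title}]},
        'Elixir.Crawly.Pipelines.JSONEncoder',
        {'Elixir.Crawly.Pipelines.WriteToFile',
            [{folder, <<"/tmp">>}, {extension, <<"jl">>}]}
    ]}
]}].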
You have probably noticed that this looks like an Erlang configuration file, which is the case. I would say it is not the perfect solution, and one possible improvement would be to make the project configuration simpler. If you have ideas, write to me in GitHub discussions: https://github.com/elixir-crawly/crawly/discussions.
4. The basic configuration is now done, and we can run the Crawly application. We can start it this way:
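A sketch of the docker run command that starts the standalone Crawly container could look like the following; the image name and the in-container mount paths are assumptions, so check the Crawly documentation for the exact values:
docker run --name crawly -d -p 4001:4001 \
  -v $(pwd)/spiders:/app/spiders \
  -v $(pwd)/crawly.config:/app/config/crawly.config \
  crawly/crawly:latest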
Port 4001 is the default HTTP port used for spider management, so we need to forward it.
The spiders directory is where Crawly expects to find the spider files that will be added to the application later on.
Finally, the ugly configuration file is also mounted inside the Crawly container.
Now you can see the Crawly Management User interface on the localhost:4001
Crawly Management Tool
Working on a new spider
Now, let’s define the spider itself. Let’s start with the following boilerplate code (put it into erlang_solutions_blog/spiders/esl.ex):
defmodule ESLSpider do
use Crawly.Spider
@impl Crawly.Spider
def init() do
[start_urls: ["https://www.erlang-solutions.com/"]]
end
@impl Crawly.Spider
def base_url(), do: "https://www.erlang-solutions.com"
@impl Crawly.Spider
def parse_item(response) do
%{items: [], requests: []}
end
end
This code defines an ESLSpider module that uses the Crawly.Spider behaviour.
The behaviour requires three functions to be implemented:
init(), base_url(), and parse_item(response).
The init() function returns a list containing a single key-value pair. The key is start_urls and the value is a list containing a single URL string: "https://www.erlang-solutions.com/". This means that the spider will start crawling from this URL.
The “base_url()” function returns a string representing the base URL for the spider, used to filter out requests that go outside of erlang-solutions.com website.
The `parse_item(response)` function takes a response object as an argument and returns a map containing two keys: `items` and `requests`
Once the code is saved, we can run it via the Web interface (you will need to restart the Docker container or click the Reload spiders button in the Web interface).
New Crawly Management UI
Once the job is started, you can review the Scheduled Requests, Logs, or Extracted Items.
Parsing the page
Now we find CSS selectors to extract the needed data. The same approach is already described here https://www.erlang-solutions.com/blog/web-scraping-with-elixir/ under extracting the data section. I think one of the best ways to find relevant CSS selectors is by just using Google Chrome’s inspect option:
So let’s connect to the Crawly Shell and fetch data using the fetcher, extracting this title:
docker exec -it crawly /app/bin/crawly remote
1> response = Crawly.fetch("https://www.erlang-solutions.com/blog/web-scraping-with-elixir/")
2> document = Floki.parse_document!(response.body)
4> title_tag = Floki.find(document, ".page-title-sm")
[{"h1", [{"class", "page-title-sm mb-sm"}], ["Web scraping with Elixir"]}]
5> title = Floki.text(title_tag)
"Web scraping with Elixir"
We are going to extract all items this way. In the end, we came up with the following map of selectors representing the expected item:
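For illustration, the resulting parse_item/1 could build the item roughly like this. Only the title selector comes from the session above; the other selectors are hypothetical placeholders and need to be found the same way with the inspector:
def parse_item(response) do
  document = Floki.parse_document!(response.body)

  item = %{
    title: document |> Floki.find(".page-title-sm") |> Floki.text(),
    # The selectors below are placeholders; look up the real ones in the inspector.
    author: document |> Floki.find(".author-name") |> Floki.text(),
    publishing_date: document |> Floki.find(".post-date") |> Floki.text(),
    url: response.request_url,
    article_body: document |> Floki.find(".post-content") |> Floki.text()
  }

  %{items: [item], requests: []}
end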
Is the spider ready now? Well, not really. Let's schedule this version of the spider again and see the results:
Scraping results
As you can see, the spider could only extract 34 items. This is quite interesting, as it’s pretty clear that Erlang Solution’s blog contains way more items. So why do we have only this amount? Can anything be done to improve it?
Debugging your spider
Some intelligent developers write everything just once, and everything works. Other people like me have to spend time debugging the code.
In my case, I start with exploring logs. There is something there I don’t like:
08:23:37.417 [info] Dropping item: %{article_body: "Scalable and Reliable Real-time MQTT Messaging Engine for IoT in the 5G Era. […]", author: "", publishing_date: "", title: "", url: "https://www.erlang-solutions.com/capabilities/emqx/"}. Reason: missing required fields
The log line above shows that the spider has dropped an item which is not an article at all, but a general page. We want to exclude these URLs from our bot's route.
Try to avoid creating unnecessary loads on a website when doing crawling activities.
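One straightforward way to achieve this (shown as a sketch; the exact approach used in the original article may differ) is to only follow links under the blog prefix when building requests in parse_item/1:
# Sketch: keep only links under /blog/ when turning hrefs into requests
requests =
  document
  |> Floki.find("a")
  |> Floki.attribute("href")
  |> Enum.filter(&String.contains?(&1, "/blog/"))
  |> Enum.map(&(response.request_url |> URI.merge(&1) |> to_string()))
  |> Crawly.Utils.requests_from_urls()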
Now, we can re-run the spider and see that we’re not hitting non-blog pages anymore (don’t forget to reload the spider’s code)!
This optimised our crawler, but it was not enough to extract more items. (Among other things, it's interesting to note that we can only reach 35 articles via the "Keep reading" links, which suggests some possible improvements to the cross-linking inside the blog itself.)
Improving the extraction coverage
When looking at the possibility of extracting more items, we should try finding a better source of links. One good way to do it is by exploring the blog’s homepage, potentially with JavaScript turned off. Here is what I can see:
Sometimes you need to switch JavaScript off to see more.
As you can see, there are 14 Pages (only 12 of which are working), and every page contains nine articles. So we expect ~100–108 articles in total.
So let's try to use this pagination as a source of new links! I have updated the init() function so it refers to the blog's index pages, and also parse_item() so it can use the information found there:
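Here is a sketch of what the updated spider could look like; the page count, URL pattern and CSS selectors are illustrative assumptions based on the structure described above:
defmodule ESLSpider do
  use Crawly.Spider

  @impl Crawly.Spider
  def init() do
    # Start from the paginated blog index instead of the home page (URL pattern assumed).
    urls = for page <- 1..14, do: "https://www.erlang-solutions.com/blog/page/#{page}/"
    [start_urls: urls]
  end

  @impl Crawly.Spider
  def base_url(), do: "https://www.erlang-solutions.com"

  @impl Crawly.Spider
  def parse_item(response) do
    document = Floki.parse_document!(response.body)

    # Follow only the article links found on the index pages.
    requests =
      document
      |> Floki.find("a")
      |> Floki.attribute("href")
      |> Enum.filter(&String.contains?(&1, "/blog/"))
      |> Enum.map(&(response.request_url |> URI.merge(&1) |> to_string()))
      |> Crawly.Utils.requests_from_urls()

    # The selectors are placeholders; only article pages yield a non-empty title.
    item = %{
      title: document |> Floki.find(".page-title-sm") |> Floki.text(),
      author: document |> Floki.find(".author-name") |> Floki.text(),
      publishing_date: document |> Floki.find(".post-date") |> Floki.text(),
      url: response.request_url,
      article_body: document |> Floki.find(".post-content") |> Floki.text()
    }

    %{items: [item], requests: requests}
  end
end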
Now, finally, after adding all fixes, let’s reload the code and re-run the spider:
So as you can see, we have extracted 114 items, which looks quite close to what we expected!
Conclusion
Honestly speaking, running an open-source project is a complex thing. We have spent almost four years building Crawly and have expanded its possibilities quite a bit, while also adding some bugs along the way.
The example above shows how to run a scraper with Elixir/Floki, along with the somewhat more involved process of debugging and fixing that often comes up in practice.
We want to thank Erlang Solutions for supporting the development and allocating help when needed!
New mod_muc_rtbl, Real-Time Block List for MUC rooms
Binaries use Erlang/OTP 25.3, and changes in containers
A more detailed explanation of these topics and other features:
Many improvements to SQL databases
There are many improvements in the area of SQL databases (see #3980 and #3982):
Added support for migrating MySQL and MS SQL to new schema, fixed a long-standing bug, and many other improvements.
Regarding MS SQL, there are schema fixes, added support for new schema and the corresponding schema migration, along with other minor improvements and bugfixes.
The automated ejabberd tests now also run on updated schema databases, and support for running tests on MS SQL has been added.
Other minor SQL schema inconsistencies were fixed, unnecessary indexes were removed, and PostgreSQL SERIAL columns were changed to BIGSERIAL columns.
Please upgrade your existing SQL database, check the notes later in this document!
Added mod_mam support for XEP-0425: Message Moderation
XEP-0425: Message Moderation allows a Multi-User Chat (XEP-0045) moderator to moderate certain group chat messages, for example by removing them from the group chat history, as part of an effort to address and resolve issues such as message spam, inappropriate venue language, or revealing private personal information of others. It also allows moderators to correct a message on another user’s behalf, or flag a message as inappropriate, without having to retract it.
This new module implements Real-Time Block List for MUC rooms. It works by monitoring remote pubsub nodes according to the specification described in xmppbl.org.
Now captcha_url gets an improvement: if set to auto, it tries to detect the URL automatically, taking into account the ejabberd configuration. This is now the default. This should be good enough in most cases, but manually setting the URL may be necessary when using port forwarding or very specific setups.
Erlang/OTP 19.3 is deprecated
This is the last ejabberd release with support for Erlang/OTP 19.3. If you have not already done so, please upgrade to Erlang/OTP 20.0 or newer before the next ejabberd release. See the ejabberd 22.10 release announcement for more details.
About the binary packages provided for ejabberd:
The binary installers and container images now use Erlang/OTP 25.3 and Elixir 1.14.3.
The mix, ecs and ejabberd container images now use Alpine 3.17.
The ejabberd container image now supports an alternative build method, useful to work around a problem in QEMU and Erlang 25 when building the image for the arm64 architecture.
Erlang node name in ecs container image
The ecs container image is built using the files from docker-ejabberd/ecs and published in docker.io/ejabberd/ecs. This image generally gets only minimal fixes, no major or breaking changes, but in this release it got one change that requires administrator intervention.
The Erlang node name is now fixed to ejabberd@localhost by default, instead of being variable based on the container hostname. If you previously allowed ejabberd to choose its node name (which was random), it will now create a new mnesia database instead of using the previous one:
$ docker exec -it ejabberd ls /home/ejabberd/database/
ejabberd@1ca968a0301a
ejabberd@localhost
...
A simple solution is to start the container with ERLANG_NODE_ARG set to the old Erlang node name, for example:
docker run ... -e ERLANG_NODE_ARG=ejabberd@1ca968a0301a
In addition to the previously mentioned change to the default erlang node name, the ecs container image has received other improvements:
For each commit to the docker-ejabberd repository that affects ecs and mix container images, those images are uploaded as artifacts and are available for download in the corresponding runs.
When a new release is tagged in the docker-ejabberd repository, the image is automatically published to ghcr.io/processone/ecs, in addition to being manually published to the Docker Hub.
Thanks also to all the people who help solve doubts and problems in the ejabberd chatroom and issue tracker.
Updating SQL Databases
These notes allow you to apply the SQL database schema improvements in this ejabberd release to your existing SQL database. Please consider which database you are using and whether it is the default or the new schema.
PostgreSQL new schema:
Fixes a long-standing bug in the new schema on PostgreSQL. The fix for all existing affected installations is the same:
ALTER TABLE vcard_search DROP CONSTRAINT vcard_search_pkey;
ALTER TABLE vcard_search ADD PRIMARY KEY (server_host, lusername);
PostgreSQL default or new schema:
These statements convert integer columns to BIGINT so the tables can grow beyond roughly 2 billion rows. The conversion requires full table rebuilds and will take a long time if the tables already have many rows. It is optional: it is not necessary if the tables will never grow that large.
ALTER TABLE archive ALTER COLUMN id TYPE BIGINT;
ALTER TABLE privacy_list ALTER COLUMN id TYPE BIGINT;
ALTER TABLE pubsub_node ALTER COLUMN nodeid TYPE BIGINT;
ALTER TABLE pubsub_state ALTER COLUMN stateid TYPE BIGINT;
ALTER TABLE spool ALTER COLUMN seq TYPE BIGINT;
PostgreSQL or SQLite default schema:
DROP INDEX i_rosteru_username;
DROP INDEX i_sr_user_jid;
DROP INDEX i_privacy_list_username;
DROP INDEX i_private_storage_username;
DROP INDEX i_muc_online_users_us;
DROP INDEX i_route_domain;
DROP INDEX i_mix_participant_chan_serv;
DROP INDEX i_mix_subscription_chan_serv_ud;
DROP INDEX i_mix_subscription_chan_serv;
DROP INDEX i_mix_pam_us;
PostgreSQL or SQLite new schema:
DROP INDEX i_rosteru_sh_username;
DROP INDEX i_sr_user_sh_jid;
DROP INDEX i_privacy_list_sh_username;
DROP INDEX i_private_storage_sh_username;
DROP INDEX i_muc_online_users_us;
DROP INDEX i_route_domain;
DROP INDEX i_mix_participant_chan_serv;
DROP INDEX i_mix_subscription_chan_serv_ud;
DROP INDEX i_mix_subscription_chan_serv;
DROP INDEX i_mix_pam_us;
And now add an index that might be missing:
In PostgreSQL:
CREATE INDEX i_push_session_sh_username_timestamp ON push_session USING btree (server_host, username, timestamp);
In SQLite:
CREATE INDEX i_push_session_sh_username_timestamp ON push_session (server_host, username, timestamp);
MySQL default schema:
ALTER TABLE rosterusers DROP INDEX i_rosteru_username;
ALTER TABLE sr_user DROP INDEX i_sr_user_jid;
ALTER TABLE privacy_list DROP INDEX i_privacy_list_username;
ALTER TABLE private_storage DROP INDEX i_private_storage_username;
ALTER TABLE muc_online_users DROP INDEX i_muc_online_users_us;
ALTER TABLE route DROP INDEX i_route_domain;
ALTER TABLE mix_participant DROP INDEX i_mix_participant_chan_serv;
ALTER TABLE mix_subscription DROP INDEX i_mix_subscription_chan_serv_ud;
ALTER TABLE mix_subscription DROP INDEX i_mix_subscription_chan_serv;
ALTER TABLE mix_pam DROP INDEX i_mix_pam_u;
MySQL new schema:
ALTER TABLE rosterusers DROP INDEX i_rosteru_sh_username;
ALTER TABLE sr_user DROP INDEX i_sr_user_sh_jid;
ALTER TABLE privacy_list DROP INDEX i_privacy_list_sh_username;
ALTER TABLE private_storage DROP INDEX i_private_storage_sh_username;
ALTER TABLE muc_online_users DROP INDEX i_muc_online_users_us;
ALTER TABLE route DROP INDEX i_route_domain;
ALTER TABLE mix_participant DROP INDEX i_mix_participant_chan_serv;
ALTER TABLE mix_subscription DROP INDEX i_mix_subscription_chan_serv_ud;
ALTER TABLE mix_subscription DROP INDEX i_mix_subscription_chan_serv;
ALTER TABLE mix_pam DROP INDEX i_mix_pam_us;
Add an index that might be missing:
CREATE INDEX i_push_session_sh_username_timestamp ON push_session (server_host, username(191), timestamp);
MS SQL
DROP INDEX [rosterusers_username] ON [rosterusers];
DROP INDEX [sr_user_jid] ON [sr_user];
DROP INDEX [privacy_list_username] ON [privacy_list];
DROP INDEX [private_storage_username] ON [private_storage];
DROP INDEX [muc_online_users_us] ON [muc_online_users];
DROP INDEX [route_domain] ON [route];
go
MS SQL schema was missing some tables added in earlier versions of ejabberd:
CREATE TABLE [dbo].[mix_channel] (
[channel] [varchar] (250) NOT NULL,
[service] [varchar] (250) NOT NULL,
[username] [varchar] (250) NOT NULL,
[domain] [varchar] (250) NOT NULL,
[jid] [varchar] (250) NOT NULL,
[hidden] [smallint] NOT NULL,
[hmac_key] [text] NOT NULL,
[created_at] [datetime] NOT NULL DEFAULT GETDATE()
) TEXTIMAGE_ON [PRIMARY];
CREATE UNIQUE CLUSTERED INDEX [mix_channel] ON [mix_channel] (channel, service)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON);
CREATE INDEX [mix_channel_serv] ON [mix_channel] (service)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON);
CREATE TABLE [dbo].[mix_participant] (
[channel] [varchar] (250) NOT NULL,
[service] [varchar] (250) NOT NULL,
[username] [varchar] (250) NOT NULL,
[domain] [varchar] (250) NOT NULL,
[jid] [varchar] (250) NOT NULL,
[id] [text] NOT NULL,
[nick] [text] NOT NULL,
[created_at] [datetime] NOT NULL DEFAULT GETDATE()
) TEXTIMAGE_ON [PRIMARY];
CREATE UNIQUE INDEX [mix_participant] ON [mix_participant] (channel, service, username, domain)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON);
CREATE INDEX [mix_participant_chan_serv] ON [mix_participant] (channel, service)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON);
CREATE TABLE [dbo].[mix_subscription] (
[channel] [varchar] (250) NOT NULL,
[service] [varchar] (250) NOT NULL,
[username] [varchar] (250) NOT NULL,
[domain] [varchar] (250) NOT NULL,
[node] [varchar] (250) NOT NULL,
[jid] [varchar] (250) NOT NULL
);
CREATE UNIQUE INDEX [mix_subscription] ON [mix_subscription] (channel, service, username, domain, node)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON);
CREATE INDEX [mix_subscription_chan_serv_ud] ON [mix_subscription] (channel, service, username, domain)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON);
CREATE INDEX [mix_subscription_chan_serv_node] ON [mix_subscription] (channel, service, node)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON);
CREATE INDEX [mix_subscription_chan_serv] ON [mix_subscription] (channel, service)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON);
CREATE TABLE [dbo].[mix_pam] (
[username] [varchar] (250) NOT NULL,
[channel] [varchar] (250) NOT NULL,
[service] [varchar] (250) NOT NULL,
[id] [text] NOT NULL,
[created_at] [datetime] NOT NULL DEFAULT GETDATE()
) TEXTIMAGE_ON [PRIMARY];
CREATE UNIQUE CLUSTERED INDEX [mix_pam] ON [mix_pam] (username, channel, service)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON);
go
MS SQL also had some incompatible column types:
ALTER TABLE [dbo].[muc_online_room] ALTER COLUMN [node] VARCHAR (250);
ALTER TABLE [dbo].[muc_online_room] ALTER COLUMN [pid] VARCHAR (100);
ALTER TABLE [dbo].[muc_online_users] ALTER COLUMN [node] VARCHAR (250);
ALTER TABLE [dbo].[pubsub_node_option] ALTER COLUMN [name] VARCHAR (250);
ALTER TABLE [dbo].[pubsub_node_option] ALTER COLUMN [val] VARCHAR (250);
ALTER TABLE [dbo].[pubsub_node] ALTER COLUMN [plugin] VARCHAR (32);
go
… and mqtt_pub table was incorrectly defined in old schema:
ALTER TABLE [dbo].[mqtt_pub] DROP CONSTRAINT [i_mqtt_topic_server];
ALTER TABLE [dbo].[mqtt_pub] DROP COLUMN [server_host];
ALTER TABLE [dbo].[mqtt_pub] ALTER COLUMN [resource] VARCHAR (250);
ALTER TABLE [dbo].[mqtt_pub] ALTER COLUMN [topic] VARCHAR (250);
ALTER TABLE [dbo].[mqtt_pub] ALTER COLUMN [username] VARCHAR (250);
CREATE UNIQUE CLUSTERED INDEX [dbo].[mqtt_topic] ON [mqtt_pub] (topic)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON);
go
… and sr_group index/PK was inconsistent with other DBs:
ALTER TABLE [dbo].[sr_group] DROP CONSTRAINT [sr_group_PRIMARY];
CREATE UNIQUE CLUSTERED INDEX [sr_group_name] ON [sr_group] ([name])
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON);
go
ChangeLog
General
New s2s_out_bounce_packet hook
Re-allow anonymous connections for connections without client certificates (#3985)
Stop ejabberd_system_monitor before stopping node
captcha_url option now accepts auto value, and it’s the default
mod_mam: Add support for XEP-0425: Message Moderation
mod_mam_sql: Fix problem with results of mam queries using rsm with max and before
mod_muc_rtbl: New module for Real-Time Block List for MUC rooms (#4017)
mod_roster: Set roster name from XEP-0172, or the stored one (#1611)
mod_roster: Preliminary support to store extra elements in subscription request (#840)
mod_pubsub: Pubsub xdata fields max_item/item_expira/children_max use max not infinity
mod_vcard_xupdate: Invalidate vcard_xupdate cache on all nodes when vcard is updated
Admin
ext_mod: Improve support for loading *.so files from ext_mod dependencies
Improve output in gen_html_doc_for_commands command
The sprint will be all about XMPP and offers the opportunity to meet, present and discuss, but also to work on your projects and implementations.
Of course, the event is open to general newcomers, XMPP users and any interested party. If you’re planning to attend, signing up would help us organise things.
Info
Date & Time
Thursday, 22-06-2023 18:00 CEST — Sunday, 25-06-2023 12:00 CEST
Location
CCC Hansestadt Hamburg e.V.
Zeiseweg 9
Viktoria-Kaserne, mittlerer Osten, 1. OG, Raum 2 (mid east building area, 1st floor (= first level above the ground), room 2)
22765 Hamburg
Organisational & Attendance
If you plan to join us, please add yourself to the list of participants.
Adding yourself to the list will help us organise everything - thanks! If you don’t have a wiki account, please reach out via chat (see below).
Chat & Communication
It’s recommended to join the chat and say hello if you are interested: XMPP Chat & WebChat
Feel free to share via Mastodon or Twitter!
The two parks I’m going to review today are also connected by the M2R trail in
addition to the Concord Road Trail, but unlike the previous parks these are
linear parks that are integrated directly into the trails!
Since the linear parks aren’t very large and don’t have much in the way of
amenities to talk about, we'll veer outside of our Smyrna focus and discuss a
few other highlights of the Concord Road Trail and the southern portion of the
M2R trail, starting with the Chattahoochee River.
Paces Mill
Amenities:🏞️👟🥾🛶💩🚲
Transportation:🚍🚴🚣
The southern terminus of the M2R trail is at the Chattahoochee River National
Recreation Area’s Paces Mill Unit.
In addition to the paved walking and biking trails, the park has several miles
of unpaved hiking trail, fishing, and of course the river itself.
Dogs are allowed and bags are available near the entrance.
If you head north on the paved Rottenwood Creek Trail you’ll eventually connect
to the Palisades West Trails, the Bob Callan Trail, and the Akers Mill
East Trail, giving you access to one of the largest connected mixed-use trail
systems in the Atlanta area!
If, instead, you head out of the park to the south on the M2R trail you’ll
quickly turn back north into the urban sprawl of the Atlanta suburbs.
In approximately 2km you’ll reach the Cumberland Transfer Center where you can
catch a bus to most anywhere in Cobb, or transfer to MARTA in Atlanta-proper.
At this point the trail also forks for a more direct route to the Silver Comet
Trail using the Silver Comet Cumberland Connector trail.
We may take that trail another day, but for now we’ll continue north on the M2R
trail.
Just a bit further north there are also short connector trails to Cobb Galleria
Center (an exhibit hall and convention center) and The Battery, a mixed-use
development surrounding the Atlanta Braves baseball stadium.
It’s at this point that the trail turns west along Spring Road where it
coincides with the Spring Road Trail that connects to the previously-reviewed
Jonquil Park (a total ride of ~3.7km).
Shortly thereafter we reach our first actual un-reviewed Smyrna park: the Spring
Road Linear Park.
The park does not have a sign or other markers, but does have several nice pull
offs with benches that make a good stop over point on your way home to or from
the buses at the Cumberland Transfer Center.
If you’re out walking the dog, public trash cans and dog-poo bags are available on the east end of the park, but do keep in mind that the main trail is mixed-use, so dogs should be kept to one side of the trail to avoid incidents with bikes.
After a short climb the trail turns north again and intersects with the Concord
Road Trail and the Atlanta Road Trail.
We could veer just off the trail near this point to reach Durham Park, the
subject of a future review, but instead we’ll continue west, transitioning to
the Concord Road Trail to reach our next park: Concord Road Linear Park.
The Concord Road Linear Park sits in the middle of the mid-century
Smyrna Heights neighborhood and has something special that’s not often found in
poorly designed suburban neighborhoods: (limited) mixed-use zoning!
A restaurant and bar (currently seafood) sits at the edge of the park along with
a bike repair stand and bike parking.
It’s worth commending Smyrna for creating this park at all. It may be small, but in addition to the mixed-use zoning it did something that’s also not often seen in the burbs: it removed part of Evelyn Street, disconnecting it from the nearest arterial road!
In the war-on-cars this is a small but important victory that creates a
quality-of-life improvement for everyone in the neighborhood, whether they bike,
walk the dog, or just take a stroll over to the restaurants in the town square
without having to be molested by cars.
In our next review we’ll turn back and continue up the M2R trail to reach a few
other parks, but if we were to continue we’d find that the Concord Road Trail
continues for another 4km until it terminates at the Silver Comet Trail’s
Concord Road Trail Head.
This trail head sits at mile marker 2.6 on the Silver Comet Trail, right by the
Concord Covered Bridge Historic District.
The Silver Comet will likely be covered in future posts, so for now I’ll leave
it there.
Thanks for bearing with me while we take a detour away from the City of Smyrna’s parks; next time the majority of the post will be about parks within the city, I promise.
In this post we will explore the internals of the BEAM virtual machine (VM) and compare it with the Java Virtual Machine, the JVM.
The success of any programming language in the Erlang ecosystem can be attributed to three tightly coupled components:
the semantics of the Erlang programming language, which is the foundation on which other languages are implemented,
the OTP libraries and middleware used to build scalable architectures and concurrent, resilient systems, and
the BEAM virtual machine, tightly coupled to the language semantics and OTP.
Take any of these components on its own and you have a potential winner. Put the three together and you have an undisputed winner for developing scalable, resilient, soft real-time systems. Quoting Joe Armstrong:
“You can copy Erlang’s libraries, but if they don’t run on the BEAM, you can’t emulate the semantics.”
This idea is reinforced by Robert Virding’s first rule of programming, which states that “any sufficiently complicated concurrent program written in another language contains an ad hoc, informally specified, bug-ridden, slow implementation of half of Erlang.”
In this post we will explore the internals of the BEAM virtual machine. We will compare some of them with the JVM, pointing out why you should pay special attention to them. For a long time these components have been treated as a black box that we blindly trust without understanding what is behind it. It is time to change that!
Highlights of the BEAM
Erlang and the BEAM virtual machine were invented to have a tool that solved a specific problem. They were developed by Ericsson to help implement telecom infrastructure handling both fixed and mobile networks. This infrastructure is by nature highly concurrent and scalable. It has to work in real time and should ideally never fail. We don’t want our Hangouts call with our grandmother to suddenly drop because of an error, or our online game of Fortnite to be interrupted because updates have to be applied. The BEAM virtual machine is optimised to solve many of these challenges, thanks to features that work with a predictable concurrent programming model.
The secret sauce is Erlang processes, which are lightweight, share no memory, and are managed by schedulers capable of handling millions of them across multiple processors. It uses a garbage collector that runs on a per-process basis and is highly optimised to reduce the impact on other processes. As a result, the garbage collector does not affect the global soft real-time properties of the system. The BEAM is also the only widely used virtual machine at scale with a purpose-built distribution model that lets a program run transparently across multiple machines.
Highlights of the JVM
The Java Virtual Machine (JVM) was developed by Sun Microsystems in an attempt to provide a platform where you “write code once” and run it anywhere. They created an object-oriented language, similar to C++ but memory-safe, since its runtime error detection checks array bounds and pointer dereferences. The JVM ecosystem became extremely popular in the Internet era, turning into the de facto standard for enterprise server application development. The wide range of applicability was made possible by a virtual machine that adapts to many use cases and an impressive set of libraries geared towards enterprise development.
The JVM was designed with efficiency in mind. Most of its concepts are abstractions of features found in popular operating systems, such as the threading model, which maps closely to operating system threads. The JVM is highly customisable, including the garbage collector and class loaders. Some state-of-the-art garbage collector implementations provide tunable features suited to a programming model based on shared memory.
The JVM lets you modify code while the program is running. And a JIT compiler allows byte code to be compiled to native machine code with the intent of speeding up parts of the application.
Concurrency in the Java world is mostly about running applications in parallel threads, ensuring they are fast. Programming with concurrency primitives is a difficult task because of the challenges created by its shared-memory model. To overcome these difficulties, there have been attempts to simplify and unify the concurrent programming models, the most successful of which is the Akka framework.
Concurrency and Parallelism
When we talk about parallel code execution, we mean that parts of the code run at the same time on multiple processors or computers, while concurrent programming refers to handling events that arrive independently. Concurrent execution can be simulated on single-threaded hardware, while parallel execution cannot. Although this distinction may seem pedantic, the difference results in problems that need to be solved with very different approaches. Think of many cooks preparing a dish of pasta carbonara. In the parallel approach, the tasks are split across the number of available cooks, and a single portion is completed as quickly as it takes those cooks to finish their specific tasks. In a concurrent world, you get one portion per cook, where each cook does all of the tasks.
Use parallelism for speed and concurrency for scalability.
Parallel execution tries to find an optimal decomposition of the problem into parts that are independent of each other. Boil the water, get the pasta out, mix the egg, fry the ham, grate the cheese. The shared data (or, in our example, the dish to be served) is handled with locks, mutexes and other techniques that guarantee correct execution. Another way of looking at it is that the data (or ingredients) are present and we want to use as many parallel CPU resources as possible to finish the job as quickly as we can.
Concurrent programming, on the other hand, deals with many events arriving in the system at different times and tries to process all of them within a reasonable timeframe. On multi-processor or distributed architectures, part of the execution happens in parallel, but this is not a requirement. Another way of looking at it is that the same cook boils the water, gets the pasta out, mixes the eggs, and so on, following a sequential algorithm that is always the same. What changes across processes (or cooking runs) is the data (or ingredients) to work on, which exist in multiple instances.
The JVM is designed for parallelism, the BEAM for concurrency. They are two intrinsically different problems that require different solutions.
The BEAM and concurrency
The BEAM provides lightweight processes to give context to the running code. These processes, also called actors, don’t share memory, but communicate through message passing, copying data from one process to another. Message passing is a feature that the virtual machine implements through mailboxes owned by individual processes. It is a non-blocking operation, meaning that sending a message from one process to another is almost instantaneous and the sender’s execution is not blocked. The messages sent are immutable data, copied from the sending process’s stack to the receiving process’s mailbox. This is achieved without locks and mutexes among processes; the only lock on the mailbox covers the case where several processes send a message to the same recipient in parallel.
Immutable data and message passing allow the programmer to write processes that work independently and to focus on functionality instead of low-level memory management and task scheduling. This simple design works not only on a single thread, but also on multiple threads on a local machine running the same VM and, using the built-in distribution, across the network with a cluster of VMs and machines. Because messages are immutable between processes, they can be sent to another thread (or machine) without locks, scaling almost linearly on distributed, multi-processor architectures. Processes are addressed in the same way on a local VM as on a cluster of VMs; sending messages works transparently regardless of where the receiving process is located.
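As a minimal sketch of what this looks like in practice (the message shape here is made up for illustration), an Elixir process is spawned, the sender fires off a message without blocking, and the receiver works on its own copy of the data:
# Minimal message-passing sketch: send/2 is asynchronous and the
# receiver gets its own copy of the data in its mailbox.
pid =
  spawn(fn ->
    receive do
      {:greet, name} -> IO.puts("Hello, #{name}!")
    end
  end)

send(pid, {:greet, "world"})   # returns immediately; the sender is never blocked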
Processes do not share memory, which allows data to be replicated for resilience and distributed for scale. This means you can have two instances of the same process on two separate machines, sharing state updates between them. If one machine fails, the other has a copy of the data and can keep handling the request, making the system fault tolerant. If both machines are up, both processes can handle requests, giving you scalability. The BEAM provides highly optimised primitives to make all of this work smoothly, while OTP (the standard library) provides the higher-level constructs that make programmers’ lives easier.
Akka does a great job of replicating the higher-level constructs, but it is somewhat limited by the lack of primitives in the JVM that would allow it to be highly optimised for concurrency. While the JVM primitives enable a wider range of use cases, they make developing distributed systems more complicated, with no built-in features for communication, and they often rely on a shared-memory model. For example, where in a distributed system do you place shared memory? And what is the cost of accessing it?
Scheduler
We mentioned that one of the BEAM’s strongest features is the ability to break a program down into small, lightweight processes. Managing these processes is the job of the scheduler. Unlike the JVM, which maps its threads to OS threads and lets the operating system manage them, the BEAM comes with its own scheduler.
By default, the scheduler starts one OS thread per machine processor and optimises the workload between them. Each process consists of code to be executed and a state that changes over time. The scheduler picks the first process in the run queue that is ready to run and hands it a certain number of reductions to execute, where each reduction is roughly equivalent to one command. Once the process has run out of reductions, is blocked by I/O, is waiting for a message, or has completed its execution, the scheduler picks the next process in the run queue and dispatches it. This scheduling technique is called pre-emptive.
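To get a feel for reductions, here is a small, hedged Elixir snippet; the exact numbers are implementation-dependent and only meant to show that every piece of work a process does is accounted for:
# Reductions are the unit the scheduler uses to decide when to pre-empt
# a process; Process.info/2 lets us peek at the running counter.
{:reductions, before} = Process.info(self(), :reductions)
Enum.sum(1..1_000_000)
{:reductions, after_count} = Process.info(self(), :reductions)
IO.puts("Roughly #{after_count - before} reductions consumed")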
We have mentioned the Akka framework several times; its biggest drawback is the need to annotate the code with scheduling points, since scheduling is not handled at the level of the JVM. By taking this control out of the programmer’s hands, soft real-time properties are preserved and guaranteed, as there is no risk of accidentally causing process starvation.
Processes can be spread across all of the scheduler’s available threads to maximise CPU usage. There are many ways to tweak the scheduler, but it is rarely needed and only required in certain edge cases, since the defaults cover most usage patterns.
There is one touchy subject that comes up frequently around schedulers: how to handle Natively Implemented Functions (NIFs). A NIF is a piece of code written in C, compiled as a library and run in the same memory space as the BEAM for speed. The problem with NIFs is that they are not pre-emptive and can affect the schedulers. In recent BEAM versions, a new feature, dirty schedulers, was added to give better control over NIFs. Dirty schedulers are separate schedulers running on different threads to minimise the disruption a NIF can cause in a system. The word dirty refers to the nature of the code these schedulers run.
Garbage Collector
Modern programming languages use a garbage collector for memory management, and the languages on the BEAM are no exception. Trusting the virtual machine to handle resources and manage memory is very useful when you want to write high-level concurrent code, as it simplifies the task. The underlying garbage collector implementation is fairly simple and efficient, thanks to the memory model based on immutable state. Data is copied, not mutated, and the fact that processes don’t share memory removes process interdependencies, which, as a result, don’t need to be managed.
Another feature of the BEAM garbage collector is that it only runs when needed, on a per-process basis, without affecting other processes waiting in the run queue. As a result, garbage collection in Erlang does not stop the world. It prevents latency spikes in processing, because the VM is never stopped as a whole; only specific processes are stopped, and never all of them at the same time. In practice, it is just part of what a process does and is treated as just another reduction. The garbage collector suspends the process for a very short interval, we’re talking microseconds. There will therefore be many small bursts, triggered only when the process needs more memory. A single process usually doesn’t have large amounts of memory allocated and is often short-lived, further reducing the impact by immediately freeing all of its memory when it exits. A feature of the JVM is the ability to swap garbage collectors, so by using a commercial collector it is also possible to achieve non-stopping collection on the JVM.
The garbage collector’s characteristics are discussed in this excellent post by Lukas Larsson. There are many intricate details, but it is optimised to handle immutable data efficiently, splitting the data between the stack and the heap for each process. The best approach is to do most of the work in short-lived processes.
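A hedged illustration of that advice in Elixir: pushing memory-hungry work into a short-lived process (here via Task) means its entire heap is reclaimed when the process exits, rather than having to be collected inside the caller.
# The Task process owns the intermediate list; when it exits,
# all of that memory is freed at once.
squared_sum =
  Task.async(fn ->
    1..1_000_000
    |> Enum.map(&(&1 * &1))
    |> Enum.sum()
  end)
  |> Task.await()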
A question that often comes up on this topic is how much memory the BEAM uses. If we dig a little, the VM allocates large chunks of memory and uses custom allocators to store the data efficiently and minimise the overhead of system calls. This has two visible effects:
1) The memory used decreases gradually after the space is no longer needed.
2) Reallocating large amounts of data can mean doubling the current working memory.
The first effect can, if really necessary, be mitigated by tuning the allocator strategies. The second is easy to monitor and plan for if you have visibility into the different types of memory usage. (One monitoring tool that provides system metrics out of the box is WombatOAM.)
Hot Code Loading
Hot code loading is probably the most frequently cited unique feature of the BEAM. It means that the application logic can be updated by changing the executable code in the system while preserving the internal process state. This is achieved by replacing the loaded BEAM files and instructing the virtual machine to replace the code references in the running processes.
It is a fundamental feature for guaranteeing zero downtime in telecom infrastructure, where redundant hardware was used to handle spikes. Nowadays, in the era of containerisation, other techniques are also used to roll out updates to a production system. Those who have never used or needed it tend to dismiss it as a not-so-important feature, but it should not be underestimated in the development workflow. Developers can iterate faster by replacing part of their code without having to restart the system to test it. Even if the application is not designed to be upgradable in production, this can reduce the time needed to recompile and redeploy the system.
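A rough sketch of the mechanism in Elixir (the module and messages are invented for illustration): a long-running process keeps its state in the argument of its loop and re-enters that loop through a fully qualified call, so the next iteration picks up a newly loaded version of the module, for example after :code.load_file(Counter) or the IEx helper r Counter.
defmodule Counter do
  # spawn a process whose state lives in the argument of loop/1
  def start, do: spawn(fn -> loop(0) end)

  def loop(n) do
    receive do
      :inc ->
        # the fully qualified call is what jumps into the new module
        # version after a hot code load, while keeping `n` intact
        Counter.loop(n + 1)

      {:get, from} ->
        send(from, {:count, n})
        Counter.loop(n)
    end
  end
end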
When not to use the BEAM
It is very much about choosing the right tool for the job.
Do you need a system that is extremely fast, but you are not worried about concurrency? Do you want to handle a few events in parallel, fast? Do you need to crunch numbers for graphics, AI or analytics? Go down the C++, Python or Java route. Telecom infrastructure does not need fast float operations, so speed was never a priority. With dynamic typing, which has to do all type checks at runtime, compiler optimisations are not as trivial. So number crunching is best left to the JVM, Go or other languages that compile natively. It is no surprise that floating-point operations in Erjang, the version of Erlang running on the JVM, are 5000% faster than on the BEAM. On the other hand, where we have seen the BEAM really shine is in using its concurrency to orchestrate number crunching, outsourcing the analysis to C, Julia, Python or Rust. You do the map outside the BEAM and the reduce inside it.
The mantra has always been fast enough. It takes humans a few hundred milliseconds to perceive a stimulus (an event) and process it in their brain, which means that micro- or nanosecond response times are unnecessary for many applications. Nor is the BEAM advisable for microcontrollers, as it is too resource hungry. But for embedded systems with a bit more processing power, where multi-core processors are becoming the norm, you need concurrency, and that is where the BEAM shines. Back in the 90s, we were implementing telephony switches that handled tens of thousands of subscribers, running on embedded boards with 16 MB of memory. How much memory does a Raspberry Pi have these days?
And finally, hard real time. You probably don’t want the BEAM to manage your airbag control system. You need hard guarantees, a real-time operating system and a language with no garbage collection or exceptions. An implementation of an Erlang virtual machine running on bare metal, such as GRiSP, will give you similar guarantees.
Conclusion
Use the right tool for the job.
If you are writing a soft real-time system that has to scale out of the box and never fail, and you want to do so without the hassle of reinventing the wheel, the BEAM is definitely the technology you are looking for. For many, it works as a black box. Not knowing how it works would be like driving a Ferrari without being able to achieve optimal performance, or without understanding where that strange engine noise is coming from. That is why it is important to learn more about the BEAM, understand its internals, and be ready to tune it. For those who have used Erlang and Elixir in anger, we have launched a one-day instructor-led course that will demystify and explain much of what you have seen, while preparing you to handle massive concurrency at scale. The course is available through our new remote instructor-led training; learn more here. We also recommend The BEAM book by Erik Stenman and BEAM Wisdoms, the collection of articles by Dmytro Lytovchenko.
We are preparing for the first-ever Google Play Store launch of Cheogram Android as part of JMP coming out of beta later this year. One of the things we wanted to “just work” for Google Play users is to be able to pay for the app and get their first month of JMP “bundled” into that purchase price, to smooth the common onboarding experience. So how do the JMP servers know that the app communicating with them is running a version of the app bought from Google Play as opposed to our builds, F-Droid’s builds, or someone’s own builds? And also ensure that this person hasn’t already got a bundled month before? The documentation available on how to do this is surprisingly sparse, so let’s do this together.
Client Side
Google publishes an official Licensing Verification Library for communicating with Google Play from inside an Android app to determine if this install of the app can be associated with a Google Play purchase. Most existing documentation focuses on using this library; however, it does not expose anything in the callbacks other than “yes, license verified” or “no, not verified”. This can allow an app to check whether it is itself a purchased copy, but is not so useful for communicating that proof onward to a server. The library also contains some exciting snippets like:
// Base64 encoded -
// com.android.vending.licensing.ILicensingService
// Consider encoding this in another way in your
// code to improve security
Base64.decode(
"Y29tLmFuZHJvaWQudmVuZGluZy5saWNlbnNpbmcuSUxpY2Vuc2luZ1NlcnZpY2U=")))
Which implies that they expect developers to fork this code to use it. Digging into the code, we find in LicenseValidator.java:
public void verify(PublicKey publicKey, int responseCode, String signedData, String signature)
Which looks like exactly what we need: the actual signed assertion from Google Play and the signature! So we just need a small patch to pass those along to the callback as well as the response code currently being passed. Then we can use the excellent jitpack to include the forked library in our app:
Then we write a small class in our app code to actually use it:
import android.content.Context;
import com.google.android.vending.licensing.*;
import java.util.function.BiConsumer;
public class CheogramLicenseChecker implements LicenseCheckerCallback {
private final LicenseChecker mChecker;
private final BiConsumer<String, String> mCallback;
public CheogramLicenseChecker(Context context, BiConsumer<String, String> callback) {
mChecker = new LicenseChecker(
context,
new StrictPolicy(), // Want to get a signed item every time
context.getResources().getString(R.string.licensePublicKey)
);
mCallback = callback;
}
public void checkLicense() {
mChecker.checkAccess(this);
}
@Override
public void dontAllow(int reason) {
mCallback.accept(null, null);
}
@Override
public void applicationError(int errorCode) {
mCallback.accept(null, null);
}
@Override
public void allow(int reason, ResponseData data, String signedData, String signature) {
mCallback.accept(signedData, signature);
}
}
Here we use the StrictPolicy from the License Verification Library because we want to get freshly signed data every time, and if the device is offline the whole question is moot because we won’t be able to contact the server anyway.
This code assumes you put the Base64 encoded licensing public key from “Monetisation Setup” in Play Console into a resource R.string.licensePublicKey.
Then we need to communicate this to the server, which you can do whatever way makes sense for your protocol; with XMPP we can easily add custom elements to our existing requests so:
When trying to verify this on the server side we quickly run into some new issues. What format is this public key in? It just says “public key” and is Base64 but that’s about it. What signature algorithm is used for the signed data? What is the format of the data itself? Back to the library code!
private static final String KEY_FACTORY_ALGORITHM = "RSA";
…
byte[] decodedKey = Base64.decode(encodedPublicKey);
…
new X509EncodedKeySpec(decodedKey)
So we can see it is an X.509-related encoding, and it indeed turns out to be Base64-encoded DER. So we can run this:
to get the raw properties we might need for any library (key size, modulus, and exponent). Of course, if your library supports parsing DER directly you can also use that.
import java.security.Signature;
…
private static final String SIGNATURE_ALGORITHM = "SHA1withRSA";
…
Signature sig = Signature.getInstance(SIGNATURE_ALGORITHM);
sig.initVerify(publicKey);
sig.update(signedData.getBytes());
Combined with the Java documentation, we can thus say that the signature algorithm is PKCS#1-padded RSA with SHA1.
The format of the data is pipe-separated text. The main field of interest for us is userId, which is (as a comment says) “a user identifier unique to the <application, user> pair”. So in our server code:
We can then use the verified and extracted googlePlayUserId value to check if this user has got a bundled month before and, if not, to provide them with one during signup.
As a backend developer, I’ve spent most of my programming career away from frontend development. Whether it’s React/Elm for the web or Swift/Kotlin for mobile, these are fields of knowledge that fall outside of what I usually work with.
Nonetheless, I always wanted to have a tool at my disposal for building rich frontends. While the web seemed like the platform with the lowest bar of entry for this, the size of the Javascript ecosystem had become so vast that familiarizing oneself with it was no small task.
This is why I got very excited when Chris McCord first showed LiveView to the world. Building interactive frontends, with no Javascript required? This sounded like it was made for all of us Elixir backend developers that were “frontend curious”.
However, if you haven’t already jumped into it, you might be hesitant to start. After all: it’s often not just about learning LiveView as if you were writing a greenfield project, but about how you would add LiveView into that Phoenix app that you’re already working on.
Therefore, throughout this guide, I’ll presume that you already have an existing project that you wish to integrate LiveView into. If you have the luxury of a clean slate, then other resources (such as the Programming Phoenix LiveView book, by Bruce A. Tate and Sophie DeBenedetto) may be of more use.
I hope that this article may serve you well as a starting point!
Will it work for my use case?
You might have some worries about whether LiveView is a technology that you can introduce to your application. After all: no team likes to adopt a technology that they later figure out does not suit their use case.
There are some properties of LiveView which are inherent to the technology, and therefore must be considered:
Offline mode
The biggest question is whether you need an offline mode for your application. My guess is that you probably do not need it, but if you do, LiveView is not the technology for you. The reason for this is that LiveView is rendered on the backend, necessitating communication with it.
Latency
The second biggest question: do you expect the latency from your clients to the server to be high, and would it being high be a serious detriment to your application?
“Certain use cases and experiences demand zero-latency, as well as offline capabilities. This is where Javascript frameworks like React, Ember, etc., shine.”
Almost every interaction with a LiveView interface will send a request to the server; while requests have highly optimized payloads, if you expect the average round trip from client to server to take too many milliseconds, then the user experience will suffer. LiveView ships with tools for testing your application under increased latency, but if you already know that there is a latency ceiling your clients must stay under, and that they very likely would exceed it, then LiveView may not be suitable.
If these are not of concern to your use case, then let’s get going!
What does it take for me to start?
Phoenix setup
First of all, you’ll want to have a recent version of Phoenix, and your code up-to-date. Following are upgrade guides for older projects:
The guide is rather straightforward, so I will not reiterate its contents here. The only comment I’ll add is that the section at the very end about adding a topbar is (as the documentation points out) optional. It should be said, however, that this is added by default in new LiveView projects, so if you want a setup that’s as close as possible to a freshly generated project, you should include it.
At this point, you should have everything ready for introducing your own LiveView code!
Quick LiveView overview
Before we get to the actual coding, let’s take a quick look at the life cycle of a LiveView page. Here’s a high-level overview:
The first request made to a LiveView route will be a plain HTTP request. The router will invoke a LiveView module, which calls the mount/3 function and then the render/1 function. This will render a static page (SEO-friendly out-of-the-box, by the way!), with the required Javascript for LiveView to work. The page then opens a WebSocket connection between the client and the server.
After the WebSocket connection has been established, we get into the LiveView life cycle:
Note that mount/3 and render/1 will be called again, this time over the WebSocket connection. While this probably will not be something you need to worry about when writing your first LiveView pages, it might be of relevance to know that this is the case (discussion about this can be read here). If you have a very expensive function call to make, and you only want to do it once, consider using the connected?/1 function.
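As a hedged sketch of that idea (the expensive_load/0 function is made up here), you could guard the costly work so it only runs once the WebSocket is up:
def mount(_params, _session, socket) do
  # Only pay for the expensive work once the WebSocket connection exists;
  # the initial static render gets a cheap placeholder instead.
  data = if connected?(socket), do: expensive_load(), else: []
  {:ok, assign(socket, :data, data)}
end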
After render/1 has been called a second time, we get into the LiveView loop: wait for events, send the events over the wire, change the state on the server, then send back the minimal required data for updating the page on the client.
Let’s now see how we’ll need to change your code to get to this LiveView flow.
Making things live
Now you might be asking:
“OK, so the basics have been set up. What are the bare minimum things to get a page to be live?”
You’ll need to do the following things:
Convert an existing route to a live one
Convert the controller module into a live module
Modify the templates
Introduce liveness
Let’s go over them, one by one:
Bringing life to the dead
Here’s a question I once had, that you might be wondering:
If I’ve got a regular (“dead”) Phoenix route, can I just add something live to a portion of the page, on the existing “dead” route?
Considering how LiveView works, I’d like to transform the question into two new (slightly different) questions:
Can one preserve the current routes and controllers, having them execute live code?
Can one express the live interactions in the dead controllers?
The answer to the first question: yes, but generally you won’t. You won’t, because of the answer to the second question: no, you’ll need separate live modules to express the live interactions.
This leads to an important point:
If you want some part of a page to be live, then your whole page has to be live.
Technically, you can have the route be something other than live (e.g. a get route), and you would then use Phoenix.LiveView.Controller.live_render/3 in a “dead” controller function to render a LiveView module. This does still mean, however, that the page (the logic and templates) will be defined by the live module. You’re not “adding something live to a portion of the dead page”, but rather delegating to a live module from a dead route; you’ll still have to migrate the logic and templates to the live module.
Therefore, your live code will be in LiveView modules (instead of your current controller modules), invoked by live routes. As a sidenote: while it’s not covered by this article, you’ll eventually group live routes with live_session/3, enabling redirects between routes without full page reloads.
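As a rough preview of what that grouping looks like (the session name, hook module and routes below are invented for illustration):
live_session :authenticated, on_mount: MyAppWeb.UserAuth do
  # navigating between these routes reuses the existing WebSocket,
  # so there is no full page reload
  live "/dashboard", DashboardLive
  live "/settings", SettingsLive
end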
Introducing a live route
Many tutorials and videos about LiveView use the example of programming a continuously updating rendering of a thermostat. Let’s therefore presume that you’ve got a home automation application, and up until now you had to go to /thermostats and refresh the page to get the latest data.
The router.ex might look something like this:
defmodule HomeAutomationWeb.Router do
use HomeAutomationWeb, :router
pipeline :browser do
# ...
end
pipeline :logged_in do
# ...
end
scope "/", HomeAutomationWeb do
pipe_through [:browser, :logged_in]
# ...
resources "/thermostats", ThermostatController
post "/thermostats/reboot", ThermostatController, :reboot
end
end
This is a rather simple router (with some lines removed for brevity), but you can probably figure out how this compares to your code. We’re using a call to Phoenix.Router.resources/2 here to cover a standard set of CRUD actions; your set of actions could be different.
Let’s introduce the following route after the post-route:
live "/live/thermostats", ThermostatLive
The ThermostatLive will be the module to which we’ll be migrating logic from ThermostatController.
Creating a live module to migrate to
Creating a skeleton
Let’s start by creating a directory for LiveView modules, then create an empty thermostat_live.ex in that directory.
It might seem a bit strange to create a dedicated directory for the live modules, considering that the dead parts of your application already have controller/template/view directories. This convention, however, allows one to make use of the following feature from the Phoenix.LiveView.render/1 callback (slight changes by me, for readability):
If you don’t define [render/1 in your LiveView module], LiveView will attempt to render a template in the same directory as your LiveView. For example, if you have a LiveView named MyApp.MyCustomView inside lib/my_app/live_views/my_custom_view.ex, Phoenix will look for a template at lib/my_app/live_views/my_custom_view.html.heex.
This means that it’s common for LiveView projects to have a live directory with file pairs, such as foobar.ex and foobar.html.heex, i.e. module and corresponding template. Whether you inline your template in the render/1 function or put it in a dedicated file is up to you.
Open the lib/home_automation_web/live/thermostat_live.ex file, and add the following skeleton of the ThermostatLive module:
defmodule HomeAutomationWeb.ThermostatLive do
use HomeAutomationWeb, :live_view
def mount(_params, _session, socket) do
{:ok, socket}
end
def render(assigns) do
~H"""
<div id="thermostats">
<p>Thermostats</p>
</div>
"""
end
end
There are two mandatory callbacks in a LiveView module: mount/3, and render/1. As mentioned earlier, you can leave out render/1 if you have a template file with the right file name. You can also leave out the mount/3, but that would mean that you neither want to set any state, nor do any work on mount, which is unlikely.
Migrating mount logic
Let’s now look at our imagined HomeAutomationWeb.ThermostatController, to see what we’ll be transferring over to ThermostatLive:
defmodule HomeAutomationWeb.ThermostatController do
use HomeAutomationWeb, :controller
alias HomeAutomation.Thermostat
def index(conn, _params) do
thermostats = Thermostat.all_for_user(conn.assigns.current_user)
render(conn, :index, thermostats: thermostats)
end
# ...
def reboot(conn, %{"id" => id}) do
{:ok, thermostat} =
id
|> Thermostat.get!()
|> Thermostat.reboot()
conn
|> put_flash(:info, "Thermostat '#{thermostat.room_name}' rebooted.")
|> redirect(to: Routes.thermostat_path(conn, :index))
end
end
We’ll be porting a subset of the functions that are present in the controller module: index/2 and reboot/2. This is mostly to have two somewhat different controller actions to work with.
Let’s first focus on the index/2 function. We could imagine that Thermostat.all_for_user/1 makes a database call of some kind, possibly with Ecto. conn.assigns.current_user would be added to the assigns by the logged_in Plug in the pipeline in the router.
Let’s naively move over the ThermostatController.index/2 logic to the LiveView module, and take it from there:
defmodule HomeAutomationWeb.ThermostatLive do
use HomeAutomationWeb, :live_view
alias HomeAutomation.Thermostat
def mount(_params, _session, socket) do
thermostats = Thermostat.all_for_user(socket.assigns.current_user)
{:ok, assign(socket, %{thermostats: thermostats})}
end
def render(assigns) do
~H"""
<div id="thermostats">
<p>Thermostats</p>
</div>
"""
end
end
Firstly, we’re inserting the index/2 logic into the mount/3 function of ThermostatLive, meaning that the data will be called for on page load.
Secondly, notice that we changed the argument to Thermostat.all_for_user/1 from conn.assigns.current_user to socket.assigns.current_user. This is just a change of variable name, of course, but it signifies a change in the underlying data structure: you’re not working with a Plug.Conn struct, but rather with a Phoenix.LiveView.Socket.
So far we’ve written some sample template code inside the render/1 function definition, and we haven’t seen the actual templates that would render the thermostats, so let’s get to those.
Creating live templates
Let’s presume that you have a rather simple index page, listing all of your thermostats.
<h1>Listing Thermostats</h1>
<%= for thermostat <- @thermostats do %>
<div class="thermostat">
<div class="row">
<div class="column">
<ul>
<li>Room name: <%= thermostat.room_name %></li>
<li>Temperature: <%= thermostat.temperature %></li>
</ul>
</div>
<div class="column">
Actions: <%= link("Show", to: Routes.thermostat_path(@conn, :show, thermostat)) %>
<%= link("Edit", to: Routes.thermostat_path(@conn, :edit, thermostat)) %>
<%= link("Delete",
to: Routes.thermostat_path(@conn, :delete, thermostat),
method: :delete,
data: [confirm: "Are you sure?"]
) %>
</div>
<div class="column">
<%= form_for %{}, Routes.thermostat_path(@conn, :reboot), fn f -> %>
<%= hidden_input(f, :id, value: thermostat.id) %>
<%= submit("Reboot", class: "rounded-full") %>
<% end %>
</div>
</div>
</div>
<% end %>
<%= link("New Thermostat", to: Routes.thermostat_path(@conn, :new)) %>
Each listed thermostat has the standard resource links of Show/Edit/Delete, with a New-link at the very end of the page. The only thing that goes beyond the usual CRUD actions is the form_for, defining a Reboot-button. The Reboot-button will initiate a request to the POST /thermostats/reboot route.
As previously mentioned, we can either move this template code into the ThermostatLive.render/1 function, or we can create a template file named lib/home_automation_web/live/thermostat_live.html.heex. To get used to the new ways of LiveView, let’s put the code into the render/1 function. You can always extract it later (but remember to delete the render/1 function, if you do!).
The first step would be to simply copy-paste everything, with the small change that you need to replace every instance of @conn with @socket. Here’s what ThermostatLive will look like:
defmodule HomeAutomationWeb.ThermostatLive do
use HomeAutomationWeb, :live_view
alias HomeAutomation.Thermostat
def mount(_params, _session, socket) do
thermostats = Thermostat.all_for_user(socket.assigns.current_user)
{:ok, assign(socket, %{thermostats: thermostats})}
end
def render(assigns) do
~H"""
<h1>Listing Thermostats</h1>
<%= for thermostat <- @thermostats do %>
<div class="thermostat">
<div class="row">
<div class="column">
<ul>
<li>Room name: <%= thermostat.room_name %></li>
<li>Temperature: <%= thermostat.temperature %></li>
</ul>
</div>
<div class="column">
Actions: <%= link("Show", to: Routes.thermostat_path(@socket, :show, thermostat)) %>
<%= link("Edit", to: Routes.thermostat_path(@socket, :edit, thermostat)) %>
<%= link("Delete",
to: Routes.thermostat_path(@socket, :delete, thermostat),
method: :delete,
data: [confirm: "Are you sure?"]
) %>
</div>
<div class="column">
<%= form_for %{}, Routes.thermostat_path(@socket, :reboot), fn f -> %>
<%= hidden_input(f, :id, value: thermostat.id) %>
<%= submit("Reboot", class: "rounded-full") %>
<% end %>
</div>
</div>
</div>
<% end %>
<%= link("New Thermostat", to: Routes.thermostat_path(@socket, :new)) %>
"""
end
end
While this makes the page render, both the links and the form still perform the same “dead” navigation as before, leading to full-page reloads, not to mention that they currently navigate away from the live page.
To make the page more live, let’s focus on making the clicking of the Reboot-button result in a LiveView event, instead of a regular POST with subsequent redirect.
Changing the button to something live
The Reboot-button is a good target to turn live, as it should just fire an asynchronous event, without redirecting anywhere. Let’s have a look at how the button is currently defined:
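<%= form_for %{}, Routes.thermostat_path(@socket, :reboot), fn f -> %>
  <%= hidden_input(f, :id, value: thermostat.id) %>
  <%= submit("Reboot", class: "rounded-full") %>
<% end %>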
The reason why the “dead” template used a form_for with a submit is two-fold. Firstly, since the action of rebooting the thermostat is not a navigation action, using an anchor tag (<a>) styled to look like a button would not be appropriate: using a form with a submit button is better, since it indicates that an action will be performed, and the action is clearly defined by the form’s method and action attributes. Secondly, a form allows you to include a CSRF token, which is automatically injected into the resulting <form> with form_for.
Let’s look at what the live version will look like:
<%= link("Reboot",
to: "#",
phx_click: "reboot",
phx_value_id: thermostat.id,
data: [confirm: "Are you sure?"]
) %>
The detail above about navigation technically still applies, but in LiveView one would (generally) use a link with to: “#” for most things functioning like a button.
As a minor note: you’ll still be using forms in LiveView for data input, although you’ll be using the <.form> component, instead of calling form_for.
The phx_click event
The second thing to note is the phx_click attribute and its value, “reboot”. The key indicates what event should be fired when interacting with the generated <a> tag. The various possible event bindings can be found here: https://hexdocs.pm/phoenix_live_view/bindings.html
If you want to have a reference for what events you can work with in LiveView, the link above is a good one to bookmark!
Clarifying a potentially confusing detail: the events listed in the above linked documentation use hyphens (-) as separators in their names. link uses underscores (_), but apart from this, the event names are the same.
The “reboot” string specifies the “name” of the event that is sent to the server. We’ll see the usage of this string in a second.
The value attribute
Finally, let’s talk about the phx_value_id attribute. phx_value_id is special, in that part of the attribute name is user defined. The phx_value_-part of the attribute name indicates to LiveView that the attribute is an “event value”, and what follows after phx_value_ (in our case: id) will be the key name in the resulting “event data map” on the server side. The value of the attribute will become the value in the map.
This means that this…:
phx_value_id: "thermostat_13",
…will be received as the following on the server:
%{id: "thermostat_13"}
Further explanation can be found in the documentation:
Adding the corresponding event to the LiveView module
Now that we’ve changed the Reboot-button in the template, we can get to the final step: amending the ThermostatLive module to react to the “reboot” event. We need to add a handle_event function to the module, and we’ll use the logic that we saw earlier in ThermostatController.reboot/2:
defmodule HomeAutomationWeb.ThermostatLive do
use HomeAutomationWeb, :live_view
alias HomeAutomation.Thermostat
def mount(_params, _session, socket) do
# ...
end
def handle_event("reboot", %{"id" => id}, socket) do
{:ok, thermostat} =
id
|> Thermostat.get!()
|> Thermostat.reboot()
{:noreply,
put_flash(
socket,
:info,
"Thermostat '#{thermostat.room_name}' rebooted."
)}
end
def render(assigns) do
# ...
end
end
This handle_event function will react to the “reboot” event. The first argument to the function is the event name, the second is any passed data (through phx-value-*), and finally the socket.
A quick note about the :noreply: presume that you’ll be using {:noreply, socket}, as the alternative ({:reply, map, socket}) is rarely useful. Just don’t worry about this, for now.
That’s it!
If you’ve been following this guide, trying to adapt it to your application, then you should have something like the following:
A live route.
A live module, where you’ve ported some of the logic from the controller module.
A template that’s been adapted to be rendered by a live module.
An element on the page that, when interacted with, causes an event to fire, with no need for a page refresh.
At this stage, one would probably want to address the other CRUD actions, at the very least having their navigation point to the live route, e.g. creating a new thermostat should not result in a redirect to the dead route. Even better would be to have the CRUD actions all be changed to be fully live, requiring no page reloads. However, this is unfortunately outside of the scope of this guide.
I hope that this guide has helped you to take your first steps toward working with LiveView!
Further reading
Here’s some closing advice that you might find useful, if you want to continue on your own.
Exploring generators
A very educational exercise is to compare the code Phoenix generates for “dead” pages vs. live pages.
The commands below first generate a “dead” CRUD page setup for a context (Devices) and entity (Thermostat), and then generate the same context and entity in a live fashion. The resulting git commits illustrate how the same intent is expressed in the two styles.
$ mix phx.new home_automation --live
$ cd home_automation
$ git init .
$ git add .
$ git commit -m "Initial commit"
$ mix phx.gen.html Devices Thermostat thermostats room_name:string temperature:integer
$ git add .
$ git commit -m "Added Devices context with Thermostat entity"
$ git show
$ mix phx.gen.live Devices Thermostat thermostats room_name:string temperature:integer
$ git add .
$ git commit -m "Added Live version of Devices with Thermostat"
$ git show
Note that when you get to the phx.gen.live step, you’ll have to answer Y to a couple of questions, as you’ll be overwriting some code. Also, you’ll generate a superfluous Ecto migration, which you can ignore.
Study these generated commits, the resulting files, and the difference between the generated approaches, as it helps a lot with understanding how the transition from dead to live is done.
Broadcasting events
You might want your live module to react to specific events in your application. In the case of the thermostat application it could be the change of temperature on any of the thermostats, or the reboot status getting updated asynchronously. In the case of a LiveView chat application, it would be receiving a new message from someone in the conversation.
A very commonly used method for generating and listening to events is making use of Phoenix.PubSub. Not only is Phoenix.PubSub a robust solution for broadcasting events, it also gets pulled in as a dependency of Phoenix, so you should already have the package installed.
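As a hedged sketch of that pattern (the PubSub server name, topic and message shape below are invented for illustration), a live module can subscribe on mount and react to broadcasts in handle_info/2:
def mount(_params, _session, socket) do
  # Subscribe only once the WebSocket is connected
  if connected?(socket) do
    Phoenix.PubSub.subscribe(HomeAutomation.PubSub, "thermostats")
  end

  {:ok, assign(socket, :thermostats, Thermostat.all_for_user(socket.assigns.current_user))}
end

# Elsewhere in the app, e.g. after a new temperature reading is saved:
# Phoenix.PubSub.broadcast(HomeAutomation.PubSub, "thermostats", {:temperature_changed, thermostat})

def handle_info({:temperature_changed, updated}, socket) do
  thermostats =
    Enum.map(socket.assigns.thermostats, fn t ->
      if t.id == updated.id, do: updated, else: t
    end)

  {:noreply, assign(socket, :thermostats, thermostats)}
end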
Regarding HTTP verbs, coming from the world of dead routes, you might be wondering:
I’ve got various GET/POST/PUT/etc. routes that serve different purposes. When building live modules, do all of the routes (with their different HTTP verbs) just get replaced with live?
Yes, mostly. Generally, the live parts of your application will handle their communication over the WebSocket connection, sending various events. This means that any kind of meaning you wish to communicate through the various HTTP verbs will instead be communicated through different events.
With that said, you may still have parts of your application that will be accessed with regular HTTP requests, which would be a reason to keep those routes around. They will not, however, be called from your live components.
Credits
Last year, Stone Filipczak wrote an excellent guide on the SmartLogic blog, on how to quickly introduce LiveView to an existing phoenix app. It was difficult to not have overlap with that guide, so my intention has been to complement it. Either way, I encourage you to check it out!