Maven Imported 1.12 Million Fediverse Posts (Updated)

Here we go again...

💡 This article has been updated.

A recent investigation by Liaizon Wakest revealed that Maven, a new social network founded by former OpenAI Team Lead Ken Stanley, has been importing a vast number of posts from the Fediverse without anyone’s consent. Additionally, it’s pulling in Bluesky statuses connected via Bridgy Fed.

In addition to pulling in posts, the import process seems to be running AI sentiment analysis to add tags and relational data after content reaches Maven’s servers. This is a core part of Maven’s product: instead of follows or likes, a model trains itself on its own data in an attempt to surface unique content algorithmically.

It’s worth mentioning that Maven received 2 million dollars in funding from former Twitter CEO Ev Williams and OpenAI CEO Sam Altman. While I have little input on Ev Williams, the relationship between Maven and OpenAI could be seen as more than a little problematic, as funding could give both parties greater incentive for Maven to adopt OpenAI’s technologies and policies.

What’s Going On?

Digging into the situation, it looks like Maven is working on their own ActivityPub implementation. Jimmy Secretan, Maven’s CTO, confirmed this in a post.

I just can’t keep any secrets around here can I 😀? As you mentioned, we have actually started ingesting posts from Mastodon (toots as they call them 😀).

We are looking to mix them in to the feed, and are doing some limited tests with that now. The good news is that when you reply to these, it should generally work to communicate across systems through ActivityPub.

We are hoping to use this to help connect Maven to a larger audience and a wider world.

This is also supported by Maven’s staging environment, which has ActivityPub response data enabled. The goal for Maven is to federate these posts back to the Fediverse for seamless communication, but the integration in their live environment seems to only go one way.

Mixed Expectations

On the one hand, Maven seems to have really dropped the ball here. One of the most important things about coming into this space as a developer is to communicate openly, and set expectations with the user community. A big part of the Fediverse cares deeply about consent, and the lack of any opt-in / opt-out mechanism feels like a missed opportunity.

On the other hand, we have to address the myths that crop up about privacy and content controls in the Fediverse. A lot of users have expectations about how their public content can be interacted with. Even 15 years in, we’re still not at a place where people have robust, conditional controls over who can view, interact with, or manipulate public content.

We also still don’t have great resources for setting cultural expectations for developers coming into the space. As we stated in our Content Nation article, most new developers have the ActivityPub spec, and little else. As a network, we need to take it upon ourselves to make our expectations front and center.
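To make the opt-in idea above concrete, here is a minimal sketch of a consent gate an importer could run before storing a remote account's posts. It rests on assumptions: the function name `may_import` and the `opted_out` set are hypothetical, and treating the `indexable` flag that Mastodon publishes on actor objects as a consent signal is our own reading. That flag was designed for search indexing, not bulk import, and nothing suggests Maven checked it.

```python
from typing import Any, Dict, Set


def may_import(actor: Dict[str, Any], opted_out: Set[str]) -> bool:
    """Decide whether a remote account's public posts may be imported.

    `actor` is the account's ActivityPub actor document, already parsed
    from JSON. `opted_out` is a hypothetical local list of actor ids
    whose owners have asked not to be imported.
    """
    actor_id = actor.get("id")
    # An explicit opt-out always wins.
    if not actor_id or actor_id in opted_out:
        return False
    # Opt-in by default: a missing or false flag is treated as "no".
    return actor.get("indexable") is True
```

The design choice that matters is the default. An account that has said nothing is skipped, rather than imported until someone complains.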

What Now?

Shortly after Liaizon made their post, Jimmy Secretan made an announcement on Mastodon that they’ve deleted the entirety of the import.

It’s clear from the feedback on this thread that even our experiments with the tech were confusing to users and didn’t fit with other people’s expectations of how it should work.

We are currently pausing this integration, at least until we can better understand how Maven can fit in as a good citizen of the Fediverse.

Searching within Maven’s app, it appears that thousands of Fediverse handles and posts are suddenly gone. This is a good development, but Maven probably has a long way to go before any part of the Fediverse will want anything to do with them.

Update

A short while ago, Jimmy Secretan posted this response on everything that happened today.

We have paused everything related to our Fediverse ingestion for now and we are removing everything ingested.

To be honest, the extreme negative reaction was a surprise to me, as I thought interaction between disparate systems was the entire point, but clearly we didn’t navigate the culture correctly.

The thing about participating in a social space, either as a user or as a service provider, is that you have to take the time to understand the norms. In a network of networks, those can be hard to pin down. Communicating your plans early and taking feedback go a long way towards setting expectations. You can’t simply implement a protocol and then pull in vast amounts of remote content to your network with no notice, and expect people to be okay with that.

I’ll leave you with this anonymous quote, since it feels appropriate: “Trust takes years to build, seconds to break, and a lifetime to repair.” If Maven wants to be a good steward of the Fediverse, it would be good for them to remember that.

Retraction Concerning Private Mentions

Update: an earlier version of this article stated that private DMs from Mastodon were mirrored onto Maven. Due to an extremely unfortunate set of circumstances, the user in question had originally created a public post by accident before hitting the “Delete and Redraft” button. What ended up on Maven was the cached public copy that never got deleted from another Mastodon instance.

We apologize for this error, and have updated the story to set the record straight, after the admin who initially reported the issue uncovered new details.
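The failure described above, a cached public copy outliving its deletion, is what honoring ActivityPub `Delete` activities is meant to prevent. Below is a minimal sketch of how a receiving server could keep its cache in step. It assumes a simple in-memory cache, and the names `PostCache`, `object_id`, and `apply_activity` are hypothetical. It is not a description of Maven's or Mastodon's actual code, and it leaves out signature verification, which a real server would need.

```python
from typing import Any, Dict, Optional

# Hypothetical in-memory cache of remote posts, keyed by ActivityPub id.
PostCache = Dict[str, Dict[str, Any]]


def object_id(obj: Any) -> Optional[str]:
    """Return the id of an activity's object.

    The object may be a bare URI or an embedded object such as a Tombstone.
    """
    if isinstance(obj, str):
        return obj
    if isinstance(obj, dict):
        return obj.get("id")
    return None


def apply_activity(cache: PostCache, activity: Dict[str, Any]) -> None:
    """Apply an incoming Create or Delete activity to the local cache."""
    kind = activity.get("type")
    if kind == "Create":
        post = activity.get("object")
        if isinstance(post, dict) and "id" in post:
            cache[post["id"]] = post
    elif kind == "Delete":
        target = object_id(activity.get("object"))
        cached = cache.get(target) if target else None
        # Only drop the copy when the deleting actor is the post's author.
        if cached is not None and cached.get("attributedTo") == activity.get("actor"):
            del cache[target]
```

A delete-and-redraft reaches remote servers as a `Delete` for the old post followed by a `Create` for the new one. A server that processes only the `Create` keeps serving the original, with its original visibility.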

Sean Tilley

Sean Tilley has been a part of the federated social web for over 15 years, starting with his experiences with Identi.ca back in 2008. Sean was involved with the Diaspora project as a Community Manager from 2011 to 2013, and helped the project move to a self-governed model. Since then, Sean has continued to study, discuss, and document the evolution of the space and the new platforms that have risen within it.

36 Comments

  1. @nen Right, the article considered that option too, but it didn’t have any clear answers. The screenshots did show, however, that a private conversation held within a single instance had ended up on Maven’s side.

  2. I understand you’re saying that this was a major violation of trust, but it seems like they are being accused of violating an unwritten set of norms. Maven followed the ActivityPub spec and the terms of service. They downloaded publicly accessible data using Mastodon servers and services as designed. They then analyzed that data and ran an algorithm to add labels, similar to how every fediverse server does. The difference here is that Maven used machine learning to add some labels, whereas others add labels such as timestamps when the local server downloads the data without using newer machine learning tech.

    Bing, DuckDuckGo, and Google also do this; they crawl the fediverse, use AI and machine learning to label content, and display it in different contexts.

    The tags that Maven adds are pretty innocent. They are just adding hashtag-like labels for discoverability.

    Furthermore, many people are upset that Maven is leaking people’s DMs. This is like living in a house where you refuse to have a front door or curtains on your windows and then getting very upset when somebody wanders in and sits down in your living room or looks in from across the street. The fediverse, by design, has no privacy. DMs are public! It says right there in Mastodon that these aren’t private. Nor are Bluesky’s DMs, by the way. There is no end-to-end encryption in the fediverse yet. Evan Prodromou is actually working on this, likely adapting the MLS standard, which is great but doesn’t exist yet.

    So my question is this: Why does the fediverse rely on unwritten and undocumented norms that are not mentioned in either the specs or terms of service? And why are people constantly surprised when others don’t follow these hidden social conventions?

    1. Hey Rabble, thanks for taking the time to share your thoughts!

      So, yeah, you hit the nail on the head: the primary problem, first and foremost, was a general lack of communication. A criticism that I myself hold is that we still rely on unspoken knowledge that lives in the heads of a few people. As I stated in some of my other articles, a big problem is the disconnect between community principles and expectations, and technical specifications. I’ll readily admit that people crossing their arms and saying “well, you should have known!” about some elusive subject is an objectively bad experience. An idea that I keep coming back to involves the possibility of launching a portal for prospective Fediverse developers to not only find the ActivityPub protocol and example code, but also explain some of the norms and expectations within the Fediverse, and some of the protocol extensions to ActivityPub that exist today.

      The main problem with this situation, in my opinion, is that Maven neglected to reach out to the community first, and study how the network operated, what the norms were, and how good stewards did things. This is not to say that they had to bend over backwards or anything, but: if you’re going to do something like ingest a million posts…maybe make yourself known beforehand, and explain what it is you’re doing. I can’t guarantee that there won’t be pushback, but it would’ve cemented at least a little bit of goodwill.

      Maven’s ActivityPub implementation was super janky, and the appearance of remote content that did not look or act like remote content led to a lot of confusion. The key concern about Maven isn’t just that they did all of this, ran a huge amount of data into their own platform, analyzed it with AI, and added metadata to posts that look like they came from Maven. The bigger concern is that, in addition to doing it without asking if anybody was cool with it, there’s a non-zero possibility that some of whatever was imported ended up in the training data that the AI uses for its algorithms. In practice, this isn’t a huge issue…but, when it comes to consent and varying perspectives on copyright, it’s kind of fucked up.

      As for Mastodon’s private messaging system: yeah, I hear you loud and clear. It sucks. I look forward to the day that something better comes along, and I’m hopeful about Evan Prodromou’s work with bringing E2EE to ActivityPub DMs. Sometimes, it’s situations like this that are necessary for us to realize how crappy some of our tech is, and that now is the time to iterate on something better.

      I’ll conclude by stating that I agree with you on all points about the underlying problems, and I think surfacing knowledge resources that are easy to access, and easy to understand, is vital. I do think that the Fediverse’s negative reaction is largely valid in this situation, but it may have more to do with the alignment of user consent and community biases against AI and Silicon Valley than almost anything else. Regardless, this is a situation where we actually have to put our money where our mouth is, and do something, rather than have a cycle that repeats over and over again.

      1. Not everyone using the fediverse agrees on how public data should be used. We shouldn’t assume there is a consensus just because a large group of the fediverse spoke loudly. I don’t think they did anything wrong, but I agree we should have better tools to control privacy if a user chooses to do so.

    2. > Furthermore, many people are upset that Maven is leaking people’s DMs. This is like living in a house where you refuse to have a front door or curtains on your windows and then getting very upset when somebody wanders in and sits down in your living room or looks in from across the street. The fediverse, by design, has no privacy. DMs are public!

      I think “DMs are public” is an exaggeration. Yes there are plenty of UX problems with it. No E2EE, which means the admin[s] of the instances involved in the conversation can read it if they want to, as well as other people getting easily added unintentionally to the mention-only conversation. I’ve come to accept these flaws and keep it in the threat model inside my mind. It’s just a janky email/IRC channel/unencrypted XMPP MUC.

      What I didn’t expect is how in the world Maven was able to access those mention-only posts when they are not mentioned at all AFAIK nor the admin of hackers.town as shown here. The Mastodon software of the instance shouldn’t have allowed Maven to fetch the post. There’s clearly a security bug in Mastodon’s implementation at least, and Maven didn’t even acknowledge that.

      1. Gargron has made a statement about this at https://mastodon.social/@Gargron/112608441965799612 and I withdraw my accusation about it being a security bug in Mastodon, which in hindsight may have been done hastily. This is most likely a delete-and-redraft situation where the redraft put it in mention-only visibility instead of the previous visibility being Public, and Maven probably didn’t implement honoring delete requests, therefore allowing the Public post that they’ve fetched before deletion to stay up.

  3. Looked into this yesterday, and at least one Threads account is listed over there, too. Not sure if this was consensual.
    And I fully agree: it is not about not wanting to grow the fediverse, but about at least getting notified when your content gets pulled to another platform.

  4. This is because most instances are using Mastodon, which has no “private” setting. My instance runs on Pleroma, and is set as private. So Maven tried to harvest, and got a 401. This is what happens when you use software which is not interested in granting security features.

    1. You can restrict the viewing of the local and federated timelines for logged-out users in Mastodon too. It’s not exclusive to *oma.

  5. @news Thanks for the write-up! Especially the leaked private message is crazy, and tells me that probably my Mastodon web interface should not call it a private message… 😛

  6. @news@wedistribute.org not really a fan of people saying “DMs aren’t e2ee, so it’s perfectly fine”
    yes it’s a problem that they’re not e2ee but that doesn’t mean people are supposed to be fetching them as public
    if I sit on my front steps w/o locking the door and someone runs past me to go into my house, that’s still incredibly invasive?

    [edit: realized people will see this comment later, so clarifying, it turned out they did not have a DM, so they didn’t do any kind of interception.]

  7. @brettk I think we’re basically in agreement. But I think my main point is unlike any of those other projects, they aren’t fediverse citizens at all. You can’t be a citizen of the fediverse if you aren’t using/implementing the protocol the fediverse runs on.

  8. I feel sorry for Jimmy Secretan. The amount of hate and harassment developers get for trying to integrate the Fediverse is absolutely horrible, and the amount of misunderstanding — from multiple independent developers — shows that the loud minority on the Fediverse (the “Fediverse culture”) doesn’t actually gel well with how the rest of society works. Shame on you, mob.
