Hello Lemmings.

I will be attempting to make a federated anime tracker this summer, but I am not quite sure what features people would want and how I would get the details for animes, mangas, etc.

For the latter: My first idea was to scrape other anime websites continuously in the background, but this is most likely against the ToS of every anime tracking website, such as AniList or MAL. (I actually asked anidb.net for special access to their DB, since apparently you can request it, but I’ve been left on read by the two staff members.) My second idea was to make it a tracker where anime entries are only user-submitted, with submissions approved by assigned moderators. However, I think this would be quite inconvenient. I’d like to get your opinions and/or ideas on this.
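
As a rough illustration of the second idea, a moderated submission flow could look something like the sketch below. Every name in it (`ModerationQueue`, `Submission`, `Status`) is made up for illustration and is not taken from any existing tracker.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class Submission:
    title: str
    submitted_by: str
    status: Status = Status.PENDING


class ModerationQueue:
    """Holds user submissions until a moderator reviews them (hypothetical)."""

    def __init__(self, moderators: set[str]):
        self.moderators = moderators
        self.submissions: list[Submission] = []

    def submit(self, title: str, user: str) -> Submission:
        # New entries always start as PENDING.
        entry = Submission(title=title, submitted_by=user)
        self.submissions.append(entry)
        return entry

    def review(self, entry: Submission, moderator: str, approve: bool) -> None:
        # Only assigned moderators may change a submission's status.
        if moderator not in self.moderators:
            raise PermissionError(f"{moderator} is not a moderator")
        entry.status = Status.APPROVED if approve else Status.REJECTED

    def catalog(self) -> list[str]:
        # Only approved entries show up in the tracker.
        return [s.title for s in self.submissions if s.status is Status.APPROVED]
```

The inconvenience mentioned above shows up directly here: nothing reaches `catalog()` until a moderator calls `review()`.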

For the former: If you have any feature requests or suggestions, please drop them in the comments section.

Thanks in advance.

  • asudoxOP
    7
    9 months ago

    Is there a good reason why? Even MAL takes user submissions.

    • @[email protected]
      8
      9 months ago

      It’s a slow system. MAL takes user submissions because they already have a big database and those submissions help fill in the gaps. If you don’t have a database to begin with, it slows things down, especially if these submissions are accepted on trust. Relying on random users for this is also very risky.

      However, I’m not completely opposed to that idea because it can be helpful in some areas. My suggestion is to start with a basic dog-tag system: the anime name, alternative names, status (airing, completed), season and start date, studios (also licensors and producers), age rating, etc. This information needs to be scraped, since that is the fastest way to build a database; it is publicly available (even on Wikipedia), so it should be fine to scrape. You can even go fully with Wikipedia once you have the names. User submissions could be useful for the introduction / summary parts of the titles at this stage. For names only (and basic tags), you can scrape AniDB from this list. It’s just a search query, so it shouldn’t be against their ToS.

      You can also check Kitsu for ideas; I like their DB request system. It’s pretty basic, but it could be done differently with the power of ActivityPub.
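
To make the basic tag system above a bit more concrete, here is a minimal sketch of what one record might hold. The class name `TitleRecord` and all field names are assumptions chosen for illustration; they simply mirror the fields listed in the comment.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class TitleRecord:
    """Basic metadata for one title; all field names are illustrative."""
    name: str
    alternative_names: list[str] = field(default_factory=list)
    status: str = "airing"               # e.g. "airing" or "completed"
    season: Optional[str] = None         # e.g. "spring"
    start_date: Optional[date] = None
    studios: list[str] = field(default_factory=list)
    licensors: list[str] = field(default_factory=list)
    producers: list[str] = field(default_factory=list)
    age_rating: Optional[str] = None
    summary: Optional[str] = None        # left empty for user submissions

    def needs_summary(self) -> bool:
        # True while nobody has submitted an introduction/summary yet.
        return not self.summary
```

Records created from scraped names start with `needs_summary()` returning `True`, which is where user submissions could come in.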

      • asudoxOP
        2
        edit-2
        9 months ago

        I see. Then I can use the Jikan API to scrape anime and manga, which will take approximately 1-2 days once I get approval. Oh, and I forgot to mention: the federation part isn’t really how people think it will be, I guess. The only things federated will be the reviews, the threads, and the comments in them, with every anime/manga/VN, etc. being its own community that contains those threads and reviews. Because of that, I don’t really know if this project is something people would want to self-host. I guess I could provide full dumps of the database every month or so, but I suppose that would be expensive. Then there are the images as well, which will easily take hundreds of GBs even in compressed form.
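
A slow, paged import like the one described above might be sketched as follows. This assumes the Jikan v4 endpoint `https://api.jikan.moe/v4/anime?page=N` and a response shaped like `{"data": [...], "pagination": {"has_next_page": ...}}`; both should be checked against the current Jikan docs, and the one-second delay is only a placeholder for whatever rate limit applies. The function names are hypothetical.

```python
import json
import time
import urllib.request
from typing import Callable, Iterator

# Assumption: Jikan v4 anime listing endpoint; verify against the docs.
BASE_URL = "https://api.jikan.moe/v4/anime"


def fetch_page_http(page: int) -> dict:
    """Fetch one page of entries over HTTP (performs a real request)."""
    with urllib.request.urlopen(f"{BASE_URL}?page={page}") as resp:
        return json.load(resp)


def iter_anime(fetch_page: Callable[[int], dict] = fetch_page_http,
               delay_seconds: float = 1.0) -> Iterator[dict]:
    """Yield entries page by page, pausing between requests."""
    page = 1
    while True:
        payload = fetch_page(page)
        yield from payload.get("data", [])
        # Assumed response shape: {"pagination": {"has_next_page": bool}}
        if not payload.get("pagination", {}).get("has_next_page", False):
            break
        page += 1
        time.sleep(delay_seconds)  # stay well under the API's rate limit
```

Passing a custom `fetch_page` makes it possible to try the loop against canned data before sending any real requests.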

        • @[email protected]
          3
          9 months ago

          Didn’t know about Jikan API. After a quick look at their docs, I think it should be a steady source for scraping.

          For features, a lot can be done with ActivityPub. Of course the most wanted features would be a watchlist / episode tracker (and possibly importing from the lists people already have; I switched to Kitsu from MAL that way), but just thinking about federating all the anime/manga titles, each with its own community out of the box, sounds great. Good luck with the project!

          • asudoxOP
            3
            edit-2
            9 months ago

            Yeah, though I am not sure how the admins of federated instances would react. I am planning for every anime, manga, VN, etc. to have its own community. This means over 100k communities being created in an instant. Maybe instead of creating the communities all at once, creating each one when user activity first happens would be a better fit. But this would also prevent other platforms’ users from commenting on obscure or new anime entries until someone from the anime tracking platform comments on or reviews them.

            Thanks btw.
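
The create-on-first-activity idea above could be sketched roughly like this. `CommunityRegistry` and `Community` are hypothetical names, and the in-memory dictionaries stand in for whatever database and federation layer the project ends up using.

```python
from dataclasses import dataclass, field


@dataclass
class Community:
    title_id: int
    name: str
    posts: list[str] = field(default_factory=list)


class CommunityRegistry:
    """Creates a community the first time anyone touches a title."""

    def __init__(self, catalog: dict[int, str]):
        self._catalog = catalog                       # title_id -> title name
        self._communities: dict[int, Community] = {}

    def get_or_create(self, title_id: int) -> Community:
        if title_id not in self._communities:
            # Raises KeyError for titles that are not in the catalog.
            name = self._catalog[title_id]
            self._communities[title_id] = Community(title_id, name)
        return self._communities[title_id]

    def post(self, title_id: int, text: str) -> None:
        # Any activity, local or remote, goes through get_or_create.
        self.get_or_create(title_id).posts.append(text)

    def created_count(self) -> int:
        return len(self._communities)
```

If lookups coming from remote instances also went through `get_or_create`, users on other platforms would not have to wait for local activity, at the cost of more traffic on the hosting side.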

            • @[email protected]
              1
              9 months ago

              Both have ups and downs. Assuming these lists will be in the code, do you have an estimate of how big that would be? If you think they won’t strangle the code, just go with it. Something like storing them in JSON and loading them when needed could be better for optimization, though.

              You can also do a best-of-both-worlds approach, like not creating the communities beforehand but making the titles in the database searchable by all users. That might require more traffic on the hosting side, but it should be OK since it will be spread across all the self-hosted communities.

              I think you can also ask some of your questions in selfhosted communities.

              • asudoxOP
                1
                9 months ago

                What do you mean by “lists in the code”? Which lists, and by “in the code” do you mean hardcoded?

                • @[email protected]
                  2
                  9 months ago

                  Well, since 100k titles would make a pretty big database, I meant that hardcoding at least the metadata (years, seasons, genres, etc.) could make it run faster than going with full-fledged JSON. However, this will be an open source project and there will be localizations, so now it doesn’t look like a good idea to me.

                  • asudoxOP
                    2
                    9 months ago

                    What? What do you mean by “full-fledged JSON”? I won’t be storing the anime in JSON, but in a PostgreSQL DB. I don’t understand what you mean…

        • @[email protected]
          3
          9 months ago

          Noticed the edit:

          For hosting images, you can use alternatives, like some Lemmy sites do: mirror everything automatically to the Internet Archive.

          I think people would want to self-host, because they will get all the anime/manga titles with their communities out of the box and can moderate their own sites, while their users can react to any other community via federation.

          • asudoxOP
            4
            9 months ago

            Wouldn’t the Internet Archive be a bit slow? Also, I don’t want to stress their servers.

            • @[email protected]
              2
              9 months ago

              That’s a noble concern. I just gave it as an example since some communities do that, but yeah, it would probably be better not to do it.

              As I mentioned in my other reply, you can ask this at [email protected] and will probably get a better answer on that issue.

    • @iopq
      1
      9 months ago

      I think moderator approval is a bit slow

      • asudoxOP
        2
        9 months ago

        It is, but I have no choice other than that if I can’t scrape websites.

        • @iopq
          1
          9 months ago

          It could work so that anyone can post, but if a submission is downvoted it gets hidden, Reddit style.
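
A minimal sketch of that vote-based hiding, assuming a simple upvotes-minus-downvotes score. The threshold value and the names `VotedSubmission` and `visible` are hypothetical and would need tuning in practice.

```python
from dataclasses import dataclass

# Hypothetical cutoff: entries at or below this score are hidden.
HIDE_AT_OR_BELOW = -3


@dataclass
class VotedSubmission:
    title: str
    upvotes: int = 0
    downvotes: int = 0

    @property
    def score(self) -> int:
        return self.upvotes - self.downvotes


def visible(entries: list[VotedSubmission],
            threshold: int = HIDE_AT_OR_BELOW) -> list[VotedSubmission]:
    # Anyone can post; only entries scoring above the threshold are shown.
    return [e for e in entries if e.score > threshold]
```

Unlike moderator approval, new entries are visible immediately here and only disappear once enough users vote them down.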

          • asudoxOP
            2
            9 months ago

            That seems like a good idea. I’ll keep it in mind.