Hello, all!

I’ve been using perchance’s ai-character-chat and text-to-image services for a few months now, and while I really enjoy them, I worry that heavy use could harm the service, given the usage-to-ad ratio and so on. Beyond that, I’m a very private person, and I can’t help but worry that my messages and network requests could be viewed by my ISP or similar. While the content isn’t illegal or anything of the sort, I still don’t like the thought of my ISP or other people viewing things they have no need to see. Though that may just be me and my paranoia talking. 😅

My solution was to design my own local service, running entirely offline (for personal use only), but I’ve run into a problem. Of all the AI chat services I’ve seen, perchance has one of the best in terms of comprehension and response, in my personal opinion at least. While trying to replicate that, I can’t seem to get responses that are nearly as good, regardless of parameters.

While this is an unusual request, and of course completely up to your discretion, if you’re comfortable, could you please share the details of the model used for generation and its parameter values/ranges? If not, that’s completely okay; this is just a request to aid my personal project. So far, I’ve settled on a temperature range of 0.7-0.8 with Llama2-13B-GPTQ from TheBloke on HF.
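For anyone following along, here is a minimal sketch of what the temperature setting mentioned above does during sampling. It is not Perchance's code or any particular library's API: the function names and the example logits are made up for illustration, and a real backend would normally combine temperature with other settings such as top-p or top-k.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Convert raw logits to probabilities after dividing by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits, temperature, rng=random):
    """Draw one token index from the temperature-adjusted distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

# Hypothetical logits for four candidate tokens (made-up numbers).
example_logits = [2.0, 1.0, 0.5, -1.0]
for t in (0.7, 0.8, 1.0):
    probs = softmax_with_temperature(example_logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the top-scoring token, so the 0.7-0.8 range gives somewhat more focused output than 1.0 while still leaving room for variety.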

  • @perchance (M)
    1 month ago

    Unfortunately, a 13B model probably isn’t going to cut it. Perchance uses a popular open source 70B Llama-based model (you’ll come across its name almost immediately if you look at top model lists, but any of the top models will work fine - and you should use the recommended parameters in the HuggingFace repo). If you can’t run a 70B model, then I’d recommend these two places to find a 30B/20B/13B model to suit your specific use case, depending on your GPU size:

    This community is not well-suited to helping you get it set up, but the above two communities have lots of info.

    • @sereth (OP)
      1 month ago

      Thank you for your reply! Yeah, I did have a feeling that I’d need to run a 70B Llama-based model, but I eventually ended up using a combination of 13B and 7B parameter models that dynamically switch, which, oddly enough, actually seems to work pretty well. Your response was very helpful, and I appreciate you taking the time to respond. <3
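As a rough illustration of why GPU size decides between a 70B model and smaller 13B/7B ones, the sketch below estimates the memory needed for the weights alone. The bit widths are assumptions (the thread does not state a quantization level), the helper name is hypothetical, and the figures ignore the KV cache, activations, and runtime overhead, so real usage will be higher.

```python
def weight_memory_gib(params_billions, bits_per_weight):
    """Approximate memory for the model weights alone, in GiB."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / (1024 ** 3)

# Model sizes named in this thread; 4-bit and 16-bit are assumed precisions.
for size in (7, 13, 70):
    for bits in (4, 16):
        print(f"{size}B at {bits}-bit: ~{weight_memory_gib(size, bits):.1f} GiB")
```

By this estimate, a 13B and a 7B model loaded together at 4-bit need far less weight memory than a single 70B model at the same precision, which is consistent with the smaller pair being the practical option on a consumer GPU.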